You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@apr.apache.org by Uwe Zeisberger <ze...@informatik.uni-freiburg.de> on 2004/11/18 17:22:50 UTC

[PATCH] fix apr_xlate_conv_buffer and testxlate

Hello,

I found a testxlate in the test dir of apr-util, which is not build and
fails to correctly convert to UTF-7.

One reason is, that the trailing '\0' is translated to. An other is,
that iconv cannot convert (in some cases) a buffer completly without
knowing, that the string really ends at the end of inbuf. This only
affects encodings, where in one (encoded) byte may be information for
more then one decoded byte (i.e. UTF-7).

Below comes a log and a patch. I assume, that apr_xlate_conv_buffer
should convert inbuf and assume, that the strings ends then. Else its
use in test/testxlate.c is not appropriate.

(I'm on GNU/Linux using gcc 3.3.4 (Debian 1:3.3.4-13))

Below is the log message and an patch is attached.

Regards
Uwe

BTW: you will not need the .cvsignore files, which are in the subversion
repo, any more.

[[
Fix apr_xlate_conv_buffer when using iconv()

* xlate/xlate.c
  (apr_xlate_conv_buffer): terminate conversion correctly by adding the 
  terminal shift sequence.

* test/Makefile.in
  add a build rule for testxlate and add testxlate to PROGRAMS

* test/testxlate.c
  (test_conversion): instead of converting the trailing '\0', append a
  '\0' to the result.
]]

-- 
Uwe Zeisberger

http://www.google.com/search?q=2004+in+roman+numerals

Re: [PATCH] fix apr_xlate_conv_buffer and testxlate

Posted by Uwe Zeisberger <ze...@informatik.uni-freiburg.de>.
Hello Joe,

Joe Orton wrote:
> > If an application uses the new convention and is linked using a lib
> > without the change (i.e. 1.0.0) a NULL-Pointer is dereferenced by the
> > assignment of to_convert.  (This is why `if (inbuf)' was added to line
> > 389 of xlate.c.)
> 
> Yes, the new mode is documented in the header to only work with apr-util
> 1.1.0 or later, which is consistent the APR versioning requirements. 
> The application will have to check the version at compile-time or
> run-time - apu_version.h allows both.
Ah, I only looked over the patch you posted - not over the commit you
made.

Thanks for your quick answer.

Bye
Uwe

-- 
Uwe Zeisberger

exit vi, lesson II:
: w q ! <CR>

NB: write the current file

Re: [PATCH] fix apr_xlate_conv_buffer and testxlate

Posted by Joe Orton <jo...@redhat.com>.
On Wed, Dec 01, 2004 at 12:29:28PM +0100, Uwe Zeisberger wrote:
> Yesterday I noted a compatablity issue with this change:
> 
> If an application uses the new convention and is linked using a lib
> without the change (i.e. 1.0.0) a NULL-Pointer is dereferenced by the
> assignment of to_convert.  (This is why `if (inbuf)' was added to line
> 389 of xlate.c.)

Yes, the new mode is documented in the header to only work with apr-util
1.1.0 or later, which is consistent the APR versioning requirements. 
The application will have to check the version at compile-time or
run-time - apu_version.h allows both.

Regards,

joe

Re: [PATCH] fix apr_xlate_conv_buffer and testxlate

Posted by Uwe Zeisberger <ze...@informatik.uni-freiburg.de>.
Joe Orton wrote:
> On Mon, Nov 22, 2004 at 06:26:33AM -0500, Jeff Trawick wrote:
> > On Mon, 22 Nov 2004 11:34:10 +0100, Uwe Zeisberger
> > <ze...@informatik.uni-freiburg.de> wrote:
> > > Joe Orton wrote:
> > > > Here's what I propose to fix this, anything I'm missing?
> > 
> > it makes sense to me
> 
> OK, thanks, I've committed that fix to the trunk.  Uwe, thanks for
> tracking this down.
Yesterday I noted a compatablity issue with this change:

If an application uses the new convention and is linked using a lib
without the change (i.e. 1.0.0) a NULL-Pointer is dereferenced by the
assignment of to_convert.  (This is why `if (inbuf)' was added to line
389 of xlate.c.)

Bye
Uwe

-- 
Uwe Zeisberger

http://www.google.com/search?q=5%2B7

Re: [PATCH] fix apr_xlate_conv_buffer and testxlate

Posted by Joe Orton <jo...@redhat.com>.
On Mon, Nov 22, 2004 at 06:26:33AM -0500, Jeff Trawick wrote:
> On Mon, 22 Nov 2004 11:34:10 +0100, Uwe Zeisberger
> <ze...@informatik.uni-freiburg.de> wrote:
> > Joe Orton wrote:
> > > Here's what I propose to fix this, anything I'm missing?
> 
> it makes sense to me

OK, thanks, I've committed that fix to the trunk.  Uwe, thanks for
tracking this down.

joe

Re: [PATCH] fix apr_xlate_conv_buffer and testxlate

Posted by Jeff Trawick <tr...@gmail.com>.
On Mon, 22 Nov 2004 11:34:10 +0100, Uwe Zeisberger
<ze...@informatik.uni-freiburg.de> wrote:
> Joe Orton wrote:
> > Here's what I propose to fix this, anything I'm missing?

it makes sense to me

Re: [PATCH] fix apr_xlate_conv_buffer and testxlate

Posted by Uwe Zeisberger <ze...@informatik.uni-freiburg.de>.
Joe Orton wrote:
> Here's what I propose to fix this, anything I'm missing?

Thats nearly the same as the patch I made in my wc.  (Well, I didn't fix
the documentation in apr_xlate.h.)  My code does the same as yours, so
it should be ok.

In my opinion, what's missing is the test, that apr_xlate_conv_buffer
behaves in the same way by using apr_iconv.

Bye
Uwe

-- 
Uwe Zeisberger

http://www.google.com/search?q=2004+in+roman+numerals

Re: [PATCH] fix apr_xlate_conv_buffer and testxlate

Posted by Joe Orton <jo...@redhat.com>.
Here's what I propose to fix this, anything I'm missing?

Index: include/apr_xlate.h
===================================================================
--- include/apr_xlate.h	(revision 106171)
+++ include/apr_xlate.h	(working copy)
@@ -102,8 +102,15 @@
  * @param outbytes_left Input: the size of the output buffer
  *                      Output: the amount of the output buffer not yet used
  * @remark
- *  Return APR_ENOTIMPL if charset transcoding is not available
- *  in this instance of apr-util (i.e., APR_HAS_XLATE is undefined).
+ * Returns APR_ENOTIMPL if charset transcoding is not available
+ * in this instance of apr-util (i.e., APR_HAS_XLATE is undefined).
+ * Returns APR_INCOMPLETE if the input buffer ends in an incomplete
+ * multi-byte character.
+ *
+ * To correctly terminate the output buffer for some multi-byte
+ * character set encodings, a final call must be made to this function
+ * after the complete input string has been converted, passing
+ * the inbuf and inbytes_left parameters as NULL.
  */
 APU_DECLARE(apr_status_t) apr_xlate_conv_buffer(apr_xlate_t *convset, 
                                                 const char *inbuf, 
Index: xlate/xlate.c
===================================================================
--- xlate/xlate.c	(revision 106171)
+++ xlate/xlate.c	(working copy)
@@ -385,7 +385,7 @@
     else
 #endif
 
-    {
+    if (inbuf) {
         int to_convert = min(*inbytes_left, *outbytes_left);
         int converted = to_convert;
         char *table = convset->sbcs_table;
Index: test/testxlate.c
===================================================================
--- test/testxlate.c	(revision 106171)
+++ test/testxlate.c	(working copy)
@@ -53,6 +53,11 @@
                                                 &inbytes_left,
                                                 buf,
                                                 &outbytes_left);
+    if (status == APR_SUCCESS) {
+        status = apr_xlate_conv_buffer(convset, NULL, NULL,
+                                       buf + sizeof(buf) - outbytes_left - 1,
+                                       &outbytes_left);
+    }
     buf[sizeof(buf) - outbytes_left - 1] = '\0';
     retcode |= check_status(status, "apr_xlate_conv_buffer");
     if ((!status || APR_STATUS_IS_INCOMPLETE(status))

Re: [PATCH] fix apr_xlate_conv_buffer and testxlate

Posted by Joe Orton <jo...@redhat.com>.
On Fri, Nov 19, 2004 at 09:03:40AM -0500, Jeff Trawick wrote:
> On Thu, 18 Nov 2004 17:22:50 +0100, Uwe Zeisberger
> <ze...@informatik.uni-freiburg.de> wrote:
> 
> > Below comes a log and a patch. I assume, that apr_xlate_conv_buffer
> > should convert inbuf and assume, that the strings ends then. Else its
> > use in test/testxlate.c is not appropriate.
> 
> it was assumed that the application does not have the knowledge to
> guarantee that complete characters are always passed to
> apr_xlate_conv_buffer(); instead, we only assume that the complete
> stream of data passed by the application over multiple calls to
> apr_xlate_conv_buffer() ends with a complete character (or app gets
> the INCOMPLETE return code on last call and realizes there is a
> problem since there is no more input data which could complete the
> character)
> 
> it is a big burden on the application to know when a character ends
> 
> or am I misunderstanding you?

The app must know when the *sequence of characters* ends, and this is
what needs to be passed through to the xlate layer so it can call
iconv(,NULL,NULL,,) to correctly terminate the output sequence for a
stateful encoding like UTF-7.

I think the correct fix is going to be to have the app pass inbuf = NULL
to indicate end of stream as per iconv().

Does this test case work with apr-iconv, does anyone know?

joe


Re: [PATCH] fix apr_xlate_conv_buffer and testxlate

Posted by Jeff Trawick <tr...@gmail.com>.
On Thu, 18 Nov 2004 17:22:50 +0100, Uwe Zeisberger
<ze...@informatik.uni-freiburg.de> wrote:

> Below comes a log and a patch. I assume, that apr_xlate_conv_buffer
> should convert inbuf and assume, that the strings ends then. Else its
> use in test/testxlate.c is not appropriate.

it was assumed that the application does not have the knowledge to
guarantee that complete characters are always passed to
apr_xlate_conv_buffer(); instead, we only assume that the complete
stream of data passed by the application over multiple calls to
apr_xlate_conv_buffer() ends with a complete character (or app gets
the INCOMPLETE return code on last call and realizes there is a
problem since there is no more input data which could complete the
character)

it is a big burden on the application to know when a character ends

or am I misunderstanding you?

Re: [PATCH] fix apr_xlate_conv_buffer and testxlate

Posted by Joe Orton <jo...@redhat.com>.
On Thu, Nov 18, 2004 at 05:22:50PM +0100, Uwe Zeisberger wrote:
> Hello,
> 
> I found a testxlate in the test dir of apr-util, which is not build and
> fails to correctly convert to UTF-7.
> 
> One reason is, that the trailing '\0' is translated to. An other is,
> that iconv cannot convert (in some cases) a buffer completly without
> knowing, that the string really ends at the end of inbuf. This only
> affects encodings, where in one (encoded) byte may be information for
> more then one decoded byte (i.e. UTF-7).
> 
> Below comes a log and a patch. I assume, that apr_xlate_conv_buffer
> should convert inbuf and assume, that the strings ends then. Else its
> use in test/testxlate.c is not appropriate.

Nice work, I'd noticed this failure before but not tracked it down. 
This looks like a real can of worms - I'm not sure it's necessarily
correct to assume that apr_xlate_conv_buffer() can behave like that. 
The fact that it returns APR_INCOMPLETE implies it is safe to use like
iconv() with multiple passes over a string, but it's not specifically
documented as such.

Subversion's use of the function does use multiple passes but always
passing the complete string as input, which seems redundant at least
with a real iconv() implementation.

Jeff or Branko, any comments?

joe