You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@apr.apache.org by Jeff Trawick <tr...@attglobal.net> on 2004/03/30 14:05:22 UTC

Re: cvs commit: apr-util/xlate xlate.c

jorton@apache.org wrote:
> jorton      2004/03/30 01:58:22
> 
>   Modified:    xlate    xlate.c
>   Log:
>   * xlate/xlate.c (check_sbcs): Remove function which made unsafe
>   assumptions (a theoretical issue), and was buggy in not resetting the
>   iconv handle on failure (the cause of at least one real issue).
>   (apr_xlate_open): Updated caller.

there is some additional work to be done with this...  I have a patch to yank 
apr_xlate_sb_get() and modify the various apr-util callers of it, but probably 
won't get it committed until tonight or tomorrow a.m....

apr-util 0.9 will mark the apr_xlate_sb_get() API deprecated, and of course 
your fix to reset the iconv handle on failure needs to be in 0.9

additionally, the single-byte conversion routines need to be reworked to not 
depend on the sbcs table, which now is purely an identity table which is used 
when the to and from charsets are the same

but then the identity table use, which was implemented to piggyback on the sbcs 
"optimization," should be replaced with memcpy() ;)

Re: cvs commit: apr-util/xlate xlate.c

Posted by Joe Orton <jo...@manyfish.co.uk>.
On Tue, Mar 30, 2004 at 08:37:55AM -0500, Jeff Trawick wrote:
> >Actually I thought that the simple removal of check_sbcs() would be
> >appropriate for 0.9 too, rather than wait for someone to find a case
> >where it "does the wrong thing".
> 
> apr_xlate_sb_get() makes no sense without an sbcs check...  it is supposed 
> to be either for
> 
> i. code that knows that what its doing makes no sense if conversion is not 
> single-byte, and just wants to ask to sanity check
> 
> ii. code that wants to select simpler, faster buffer management if 
> conversion is single-byte
> 
> the situation where the app tells us to xlate between "foo" and "foo" is 
> just a small (and humorous) subset of intended purpose
> 
> removing apr_xlate_sb_get() is a change to a stable API; we don't even know 
> if there is any such charset for which current single-byte checking will 
> die, though of course the iconv handle reset bug needs to be fixed

To be clear: if there's a case where check_sbcs() gets a false positive,
"die" means it will just happily hand out incorrect translations.  That
would not be a fun case to debug (maybe a borked iconv(), maybe a user
environment screwup, maybe a terminal issue, etc).

For 0.9 if the issue is to keep EBCDIC platforms hobbling along then
maybe it should be left #if APR_CHARSET_EBCDIC.  I'm not proposing to
remove apr_xlate_sb_get() merely make it return 0 in all cases other
than identity translations.

joe

Re: cvs commit: apr-util/xlate xlate.c

Posted by Jeff Trawick <tr...@attglobal.net>.
Joe Orton wrote:
> On Tue, Mar 30, 2004 at 07:05:22AM -0500, Jeff Trawick wrote:
> 
>>jorton@apache.org wrote:
>>
>>>jorton      2004/03/30 01:58:22
>>>
>>> Modified:    xlate    xlate.c
>>> Log:
>>> * xlate/xlate.c (check_sbcs): Remove function which made unsafe
>>> assumptions (a theoretical issue), and was buggy in not resetting the
>>> iconv handle on failure (the cause of at least one real issue).
>>> (apr_xlate_open): Updated caller.
>>
>>there is some additional work to be done with this...  I have a patch to 
>>yank apr_xlate_sb_get() and modify the various apr-util callers of it, but 
>>probably won't get it committed until tonight or tomorrow a.m....
> 
> 
> Yeah, I was going to commit the more radical changes separately... 
> attached what I have.
> 
> 
>>apr-util 0.9 will mark the apr_xlate_sb_get() API deprecated, and of course 
>>your fix to reset the iconv handle on failure needs to be in 0.9
> 
> 
> Actually I thought that the simple removal of check_sbcs() would be
> appropriate for 0.9 too, rather than wait for someone to find a case
> where it "does the wrong thing".

apr_xlate_sb_get() makes no sense without an sbcs check...  it is supposed to 
be either for

i. code that knows that what its doing makes no sense if conversion is not 
single-byte, and just wants to ask to sanity check

ii. code that wants to select simpler, faster buffer management if conversion 
is single-byte

the situation where the app tells us to xlate between "foo" and "foo" is just a 
small (and humorous) subset of intended purpose

removing apr_xlate_sb_get() is a change to a stable API; we don't even know if 
there is any such charset for which current single-byte checking will die, 
though of course the iconv handle reset bug needs to be fixed

>>additionally, the single-byte conversion routines need to be reworked to 
>>not depend on the sbcs table, which now is purely an identity table which 
>>is used when the to and from charsets are the same
> 
> 
> With just identity tables remaining, conv_byte is either a nullop or an
> error, so I removed it.  

I would have expected that it would be changed to be a wrapper around the more 
complex API.  Just because APR can't/won't figure out whether a conversion is 
single-byte doesn't mean an application wants to code for the more general case 
in situations where it knows the conversion is single-byte.

Re: cvs commit: apr-util/xlate xlate.c

Posted by Joe Orton <jo...@manyfish.co.uk>.
On Tue, Mar 30, 2004 at 07:05:22AM -0500, Jeff Trawick wrote:
> jorton@apache.org wrote:
> >jorton      2004/03/30 01:58:22
> >
> >  Modified:    xlate    xlate.c
> >  Log:
> >  * xlate/xlate.c (check_sbcs): Remove function which made unsafe
> >  assumptions (a theoretical issue), and was buggy in not resetting the
> >  iconv handle on failure (the cause of at least one real issue).
> >  (apr_xlate_open): Updated caller.
> 
> there is some additional work to be done with this...  I have a patch to 
> yank apr_xlate_sb_get() and modify the various apr-util callers of it, but 
> probably won't get it committed until tonight or tomorrow a.m....

Yeah, I was going to commit the more radical changes separately... 
attached what I have.

> apr-util 0.9 will mark the apr_xlate_sb_get() API deprecated, and of course 
> your fix to reset the iconv handle on failure needs to be in 0.9

Actually I thought that the simple removal of check_sbcs() would be
appropriate for 0.9 too, rather than wait for someone to find a case
where it "does the wrong thing".

> additionally, the single-byte conversion routines need to be reworked to 
> not depend on the sbcs table, which now is purely an identity table which 
> is used when the to and from charsets are the same

With just identity tables remaining, conv_byte is either a nullop or an
error, so I removed it.  But you did want to keep it around for a future
where APR has hard-coded single-byte-lookup tables for some
translations?

> but then the identity table use, which was implemented to piggyback on the 
> sbcs "optimization," should be replaced with memcpy() ;)

Indeed, makes it much simpler :)

joe