You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@apr.apache.org by "William A. Rowe, Jr." <wr...@rowe-clan.net> on 2005/03/30 00:55:36 UTC

Re: apr-iconv 1.0

This is the original thread we discussed apr-iconv going forward
in 1.0.  It seemed at the time our conclusion was that apr-iconv
would be an internal implementation, not for consumption by the
outside world.

We have an open issue that we are finding some iconv/*.so modules
outside of -our- install, e.g. the gnu iconv, bsd iconv,
apriconv.dll and apriconv-1.dll modules can trip over one another.

Two weeks ago Curt and I kicked around solutions.  Nobody else
responded to the thread, so this post is mostly informative and
for history.

The arguments been made that we should respect APR_ICONV_PATH first,
then historical locations, then use new semantics.  That would NOT
eliminate the conflict.  In hindsite, I'm against adding a special
APR_ICONV_1_PATH, instead, I'll make APR_ICONV_PATH the very last
place we look.

The new layout will use PATH (on win32, if we update for unix,
then LD_LIBRARY_PATH on unix, SHLIB_PATH on hpux) to find the
directory (e.g. apriconv-0 or apriconv-1).  Upon the next release
of httpd, we will entirely eliminate the possibility of colliding
with iconv/.so files from another library provider.

Since we presume apriconv[-1].dll includes {charset}.so files,
any change we make to the .dll will come with an accompanying set
of .so files.  If we change this and roll out a new release, those
installing the new version will find it installs as the new layout
expects.  I can't see a 'binary compatibility' problem there.

If there are any thoughts, please voice them now, as I'll be fixing
the code tomorrow.  The httpd'ers are anxious for a beta, and all
these collisions between log4cxx, svn, httpd etc are going to become
nightmarish.

Bill

At 11:46 AM 7/2/2004, Branko Čibej wrote:
>William A. Rowe, Jr. wrote:
>
>>At 06:41 PM 7/1/2004, Branko Čibej wrote:
>>
>>>Thoughts? I think 1.0 is an auspicious time to make this change, especially if we declare apr-iconv to be an implementation detail of apr_xlate.
>>
>>The nifty bit is, if we declare apr-iconv to be an internal, implementation
>>detail of apr_xlate - we are free to adopt your suggestions in 1.0.1 :)
>That's true.
>
>>What is troubling us most, at this instant, are those things that change
>>the API in such a way that developer's code would be broken fixing the
>>problems of APR 1.0.0.  As long as they are internal details (default
>>pathing, etc) then we won't be troubled by getting it right a little later.
>Then I suggest we really do close off apr-iconv. This means the apr-iconv headers shouldn't get installed, right? Among other things.


Re: apr-iconv 1.0

Posted by Sander Striker <s....@striker.nl>.
William A. Rowe, Jr. wrote:
> At 01:02 PM 3/30/2005, Curt Arnold wrote:
> 
>> If apr-iconv is an implementation detail of apr_xlate, then I don't
>> see the significance of deprecating it.  Either it (or a replacement)
>> is needed to support apr_xlate on platforms that don't provide an
>> iconv or apr_xlate needs to be deprecated or removed.
> 
> We drop it.  Point users 'elsewhere'.  "Not Maintained Here"
> sign on the door.  Tarball moves to archive.apache.org.  Perhaps
> same for svn repository.
> 
> Obviously we want apr_xlate to point at -something- but the what
> would no longer be apr_iconv.
 
Well, do we have a replacement?  If not I don't see how we can seriously
consider dropping apr-iconv at this point.  apr-iconv is doing it's thing.
The first actual discussion in months, if not years, has been the APR_ICONV_PATH
thing, which is easily circumvented with the patch from Steve.  Let's
apply that patch and see from there.  My bet is that either the
discussion drops again and we can go for another year or two, or someone
goes to the trouble of doing the actual replacement implementation
(be that doing platform native calls or otherwise).


Sander

Re: apr-iconv 1.0

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 06:11 PM 3/30/2005, Curt Arnold wrote:
>What are the licensing or technical issues of using the X-licensed ICU4C (http://ibm.com/software/globalization/icu) to implement apr_xlate on platforms that don't have a native iconv?

It's been 2 years since I examined the option.  IIRC it's
c++ code, and is akin to calling out the NY Mets when your
son's little league team looses its pitcher.  Other than
that, I don't know there is any licensing issue or I would
not have even grabbed the download back in the day.

Bill


Re: apr-iconv 1.0

Posted by Curt Arnold <ca...@apache.org>.
What are the licensing or technical issues of using the X-licensed 
ICU4C (http://ibm.com/software/globalization/icu) to implement 
apr_xlate on platforms that don't have a native iconv?


Re: apr-iconv 1.0

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
Colm, grep doesn't help here.

Most of the references you cited are EBCDIC platform-specific
uses of apr_xlate.

Bill

At 04:18 PM 3/30/2005, Colm MacCarthaigh wrote:
>On Wed, Mar 30, 2005 at 03:50:46PM -0600, Curt Arnold wrote:
>> I don't think interpretation 1 could be supported since Subversion and 
>> likely httpd depend on apr_xlate and want to run on Windows platforms.  
>
>This keeps coming up, so - just for the record;
>
>        ./server/util.c
>        ./server/util_charset.c
>        ./server/util_script.c
>        ./server/util_ebcdic.c
>        ./modules/aaa/mod_authnz_ldap.c
>        ./modules/experimental/mod_charset_lite.c
>        ./modules/cache/mod_disk_cache.c
>        ./modules/filters/mod_include.c
>        ./include/util_ebcdic.h
>        ./include/util_charset.h
>        ./support/ab.c
>        ./support/htdbm.c
>        ./support/htdigest.c
>        ./support/htpasswd.c
>
>from httpd depend on apr_xlate. I've had to meddle with apr-iconv to get
>mod_disk_cache working (ish) on win32 in the past, so there are
>definitely some dependencies from httpd.
>
>-- 
>Colm MacCárthaigh                        Public Key: colm+pgp@stdlib.net



Re: apr-iconv 1.0

Posted by Colm MacCarthaigh <co...@stdlib.net>.
On Wed, Mar 30, 2005 at 03:50:46PM -0600, Curt Arnold wrote:
> I don't think interpretation 1 could be supported since Subversion and 
> likely httpd depend on apr_xlate and want to run on Windows platforms.  

This keeps coming up, so - just for the record;

	./server/util.c
	./server/util_charset.c
	./server/util_script.c
	./server/util_ebcdic.c
	./modules/aaa/mod_authnz_ldap.c
	./modules/experimental/mod_charset_lite.c
	./modules/cache/mod_disk_cache.c
	./modules/filters/mod_include.c
	./include/util_ebcdic.h
	./include/util_charset.h
	./support/ab.c
	./support/htdbm.c
	./support/htdigest.c
	./support/htpasswd.c

from httpd depend on apr_xlate. I've had to meddle with apr-iconv to get
mod_disk_cache working (ish) on win32 in the past, so there are
definitely some dependencies from httpd.

-- 
Colm MacCárthaigh                        Public Key: colm+pgp@stdlib.net

Re: apr-iconv 1.0

Posted by Curt Arnold <ca...@apache.org>.
On Apr 7, 2005, at 10:54 AM, William A. Rowe, Jr. wrote:

> At 03:06 PM 4/6/2005, Curt Arnold wrote:
>> The issues that I ran into:
>>
>> 1. apr_iconv dereferences inbytes_left without checking for NULL
>>
>> From the doc comments from apr_xlate_conv_buffer:
>>
>> If the final call is made as suggested, apr_iconv will case a null 
>> pointer exception.  GNU iconv does not.
>
> I'll look.  My concern is that this suggested doc change is part
> of a gnu libiconv-ism, which would break on FreeBSD.  But I need
> to look.

I wasn't suggesting a doc change, the doc is right and doing the final 
call to apr_xlate(&conv, NULL, NULL) is the proper thing to do though 
it is only essential for some fairly obscure encodings.  However, if 
you do the right thing and apr_xlate is using apr_iconv, it will fault.


>
>> 2. apr_iconv does not have a WCHAR_T encoding.
>
> Isn't wchar_t the preference, from ANSI/C99 headers?

The encoding and width of the wchar_t type is platform-dependent.  GNU 
iconv has an encoding named "WCHAR_T" that can be used to convert, for 
example, from a wchar_t* to some other encoding.  Without an "WCHAR_T" 
encoding, my code needs to know that wchar_t* on Win32 is UTF-16LE, 
UCS-4 on some other platform, etc and use the appropriate encoding 
name.


>
>> 3. apr_xlate_open(&convset, APR_LOCALE_CHARSET, ...) fails for common 
>> code pages (like 1252) on Windows
>>
>> This could be an artifact of my hacking.  When this is attempted on 
>> my machine, the code determines the current code page (in my case 
>> 1252, Western European) and creates a corresponding encoding name 
>> like Cp1252.  However, the corresponding encoding module is called 
>> "windows-1252" not "Cp1252".
>>

If you are really interested in tracking these down, I can write the 
test cases.  However just the number of issues I was immediately 
running into was enough to make me rethink my plan on using apr-xlate 
for encoding services on all platforms.	


Re: apr-iconv 1.0

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 03:06 PM 4/6/2005, Curt Arnold wrote:
>The issues that I ran into:
>
>1. apr_iconv dereferences inbytes_left without checking for NULL
>
> From the doc comments from apr_xlate_conv_buffer:
>
>If the final call is made as suggested, apr_iconv will case a null pointer exception.  GNU iconv does not.

I'll look.  My concern is that this suggested doc change is part
of a gnu libiconv-ism, which would break on FreeBSD.  But I need
to look.  

>2. apr_iconv does not have a WCHAR_T encoding.

Isn't wchar_t the preference, from ANSI/C99 headers?

>3. apr_xlate_open(&convset, APR_LOCALE_CHARSET, ...) fails for common code pages (like 1252) on Windows
>
>This could be an artifact of my hacking.  When this is attempted on my machine, the code determines the current code page (in my case 1252, Western European) and creates a corresponding encoding name like Cp1252.  However, the corresponding encoding module is called "windows-1252" not "Cp1252".

I can look.

Bill


Re: apr-iconv 1.0

Posted by Curt Arnold <ca...@apache.org>.
All my previous discussion was speculative on apr-iconv's behavior.  I 
was in the process of integrating use of apr_xlate to support 
specifying the encoding of log files but had not gotten to the point of 
testing at the time of the discussion.

I have ran into several issues when using apr_xlate delegating to (a 
hacked) apr-iconv which I did not encounter when delegating to iconv.  
I don't believe the issues are related to my hacking of the load 
process, but that is always a possibility.  At this point, my plans are 
to use Win32 API calls to support a small set of encodings on Windows 
and apr_xlate on other platforms.

The issues that I ran into:

1. apr_iconv dereferences inbytes_left without checking for NULL

 From the doc comments from apr_xlate_conv_buffer:

  * To correctly terminate the output buffer for some multi-byte
  * character set encodings, a final call must be made to this function
  * after the complete input string has been converted, passing
  * the inbuf and inbytes_left parameters as NULL.  (Note that this
  * mode only works from version 1.1.0 onwards)

If the final call is made as suggested, apr_iconv will case a null 
pointer exception.  GNU iconv does not.

2. apr_iconv does not have a WCHAR_T encoding.

3. apr_xlate_open(&convset, APR_LOCALE_CHARSET, ...) fails for common 
code pages (like 1252) on Windows

This could be an artifact of my hacking.  When this is attempted on my 
machine, the code determines the current code page (in my case 1252, 
Western European) and creates a corresponding encoding name like 
Cp1252.  However, the corresponding encoding module is called 
"windows-1252" not "Cp1252".

The last two may be artifacts of my hacking.  I didn't see any code 
that appeared to alias Cp1252 or WCHAR_T to an available encoding, but 
maybe I missed something.  However, it was enough to shake any 
confidence I had in the approach I was using.


Re: apr-iconv 1.0

Posted by Curt Arnold <ca...@apache.org>.
On Mar 31, 2005, at 1:23 PM, William A. Rowe, Jr. wrote:

> At 12:28 PM 3/31/2005, Curt Arnold wrote:
>
>> On Mar 31, 2005, at 5:27 AM, Mladen Turk wrote:
>>>
>>> I'm sure we can think of something, but if you say that we
>>> can not use native WIN32, then we could at least offer the
>>> option to use gnu-iconv for win32 builds as well, or any other
>>> iconv-api library.
>>
>> How could gnu-iconv be an option for an Apache project due to the 
>> license conflicts?  There don't seem to be any appropriate iconv 
>> implementations for Windows.
>
> Two separate issues.  Apache projects regularly support linking
> to gnu libraries, such as the native install of regex or expat or
> whatever code the GNU communities have co-opted.
>
> I'm against providing GNU-only services from apr, and believe many
> here agree.  I don't agree with inhibiting the same.  The user should
> be able to configure for either, if that's what they want to do.
>
> The difference, we wouldn't distribute a binary linked to GNU code
> from the foundation.  If third parties care to, that's fine.
> I don't have any objection to Mladen's suggestion, as one alternative
> among many.

Mladen's suggestion appeared to be drop apr-iconv and require Windows 
users of apr_xlate to depend on gnu iconv.  Using gnu iconv as an 
option is okay, but it doesn't eliminate the need to provide or 
identify an iconv implementation that is consistent with the Apache 
License.

>
> More to the point, we should have some iconv code base to rely
> on.  A wider community iconv implementation.  Not one which is
> hacked beyond recognition to use the apr interface, which enjoys
> only limited adoption.
>
>> If the only issue is the packaging (modules vs monolithic), that 
>> likely could be addressed by some fancy macro usage and inclusion of 
>> the current code base.  If someone wants me to attempt it, I'd me 
>> willing to give it a shot.  It appears that none of the potential 
>> iconv implementations has a vibrant community and that trading the 
>> dormant apr-iconv community for a dormant FreeBSD iconv community 
>> might not buy us much if the major issue with apr-iconv is packaging.
>
> Those are implementation details.  I've added Mladen to the list
> of individuals interested in seeing a BSD-licensed iconv project.
> Brane was already on that list, and I'm inquiring in the BSD and
> the newlib communities.  Should I add you to that list as an
> interested participant, Curt?
>

I was trying to explain a course of action that could address one of 
the complaints (and maybe the major one) without needing to launch a 
new project or incorporate a new code base so I don't see it as just 
implementation details.  Maybe a new BSD licensed iconv project might 
make that approach moot at some point in the future and we could drop 
apr-iconv, but it exists now and has a degree of familiarity.

I've done a little exploration using macro instead of C++ namespaces 
and the most significant issues is that iconv_ccs_desc and 
iconv_ces_desc atr used both as a structure name and as a instance name 
and I can't use a macro to rename one without affecting the other.  
However, it would be a fairly insignificant change to change to names 
of the local instance variable so the structure and the instance can be 
distinguished.  So unlike the namespace approach, it can't be done with 
no changes to the module code, but it can be done with a small change 
that does not affect their ability to still be used in the module based 
approach.

I'm overbooked as it is.  I am able to make do with the current 
apr-iconv and am only concerned with approaches that might leave me 
stranded without an option on Windows platforms.


Re: apr-iconv 1.0

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 12:28 PM 3/31/2005, Curt Arnold wrote:

>On Mar 31, 2005, at 5:27 AM, Mladen Turk wrote:
>>
>>I'm sure we can think of something, but if you say that we
>>can not use native WIN32, then we could at least offer the
>>option to use gnu-iconv for win32 builds as well, or any other
>>iconv-api library.
>
>How could gnu-iconv be an option for an Apache project due to the license conflicts?  There don't seem to be any appropriate iconv implementations for Windows.

Two separate issues.  Apache projects regularly support linking
to gnu libraries, such as the native install of regex or expat or
whatever code the GNU communities have co-opted.

I'm against providing GNU-only services from apr, and believe many
here agree.  I don't agree with inhibiting the same.  The user should
be able to configure for either, if that's what they want to do.

The difference, we wouldn't distribute a binary linked to GNU code
from the foundation.  If third parties care to, that's fine.
I don't have any objection to Mladen's suggestion, as one alternative
among many.

>>Like said, I think that we should either build our own
>>implementation of iconv functionality, or just build a wrapper
>>over existing one.

Mladen, you didn't define 'we'.  The apr project has proved less
than enthusiastic about creeping featurism.  'We' definitely want
out from iconv support, from the majority of posts I've read.

>>So, I see the future of apr-iconv as ASF implementation of
>>iconv, with iconv abstraction layer moved to apr-utils, that
>>will use apr-iconv, as just one of the flavors of iconv api.
>
>I'm not sure how that is different that the current state.  Currently, apr_xlate in apr-util will use a native iconv implementation if one is detected in configuration, otherwise it will use apr-iconv.  If you don't want to ever use the native iconv implementation, you could probably force that in the configuration script.

More to the point, we should have some iconv code base to rely
on.  A wider community iconv implementation.  Not one which is
hacked beyond recognition to use the apr interface, which enjoys
only limited adoption.

>If the only issue is the packaging (modules vs monolithic), that likely could be addressed by some fancy macro usage and inclusion of the current code base.  If someone wants me to attempt it, I'd me willing to give it a shot.  It appears that none of the potential iconv implementations has a vibrant community and that trading the dormant apr-iconv community for a dormant FreeBSD iconv community might not buy us much if the major issue with apr-iconv is packaging.

Those are implementation details.  I've added Mladen to the list
of individuals interested in seeing a BSD-licensed iconv project.
Brane was already on that list, and I'm inquiring in the BSD and
the newlib communities.  Should I add you to that list as an 
interested participant, Curt?

Bill 


Re: apr-iconv 1.0

Posted by Curt Arnold <ca...@apache.org>.
On Mar 31, 2005, at 5:27 AM, Mladen Turk wrote:

> William A. Rowe, Jr. wrote:
>>>
>>> Well, I'm interested if the final product will not depend on a
>>> hundred or so separate .dll's for each and every charset, and
>>> if apr_util will not depend on apr_iconv.
>> I'd see that as advantage as well.  I understand loadable
>> modules for complex translations, but not for tables.
>>
>
> I would like that we first resolve the static linking of
> apr-iconv to apr-utils, and using apr_dso_* for
> apr_iconv functionality. For me that is the major issue.

For log4cxx, I've replaced the module loading code so that all the 
encodings are present within liblog4cxx.so and/or log4cxx.dll.  To do 
this, I included the module definitions inside C++ namespace blocks to 
prevent symbol collision.  Obviously that is not an appropriate 
approach for apr-iconv itself, but it seems likely that you could do a 
similar thing using preprocessor macros instead of C++ namespaces.


>
> I'm sure we can think of something, but if you say that we
> can not use native WIN32, then we could at least offer the
> option to use gnu-iconv for win32 builds as well, or any other
> iconv-api library.

How could gnu-iconv be an option for an Apache project due to the 
license conflicts?  There don't seem to be any appropriate iconv 
implementations for Windows.


> Like said, I think that we should either build our own
> implementation of iconv functionality, or just build a wrapper
> over existing one.
>
> So, I see the future of apr-iconv as ASF implementation of
> iconv, with iconv abstraction layer moved to apr-utils, that
> will use apr-iconv, as just one of the flavors of iconv api.
>

I'm not sure how that is different that the current state.  Currently, 
apr_xlate in apr-util will use a native iconv implementation if one is 
detected in configuration, otherwise it will use apr-iconv.  If you 
don't want to ever use the native iconv implementation, you could 
probably force that in the configuration script.

If the only issue is the packaging (modules vs monolithic), that likely 
could be addressed by some fancy macro usage and inclusion of the 
current code base.  If someone wants me to attempt it, I'd me willing 
to give it a shot.  It appears that none of the potential iconv 
implementations has a vibrant community and that trading the dormant 
apr-iconv community for a dormant FreeBSD iconv community might not buy 
us much if the major issue with apr-iconv is packaging.


Re: apr-iconv 1.0

Posted by Mladen Turk <mt...@apache.org>.
William A. Rowe, Jr. wrote:
>>
>>Well, I'm interested if the final product will not depend on a
>>hundred or so separate .dll's for each and every charset, and
>>if apr_util will not depend on apr_iconv.
> 
> I'd see that as advantage as well.  I understand loadable
> modules for complex translations, but not for tables.
>

I would like that we first resolve the static linking of
apr-iconv to apr-utils, and using apr_dso_* for
apr_iconv functionality. For me that is the major issue.
Other is that reimplementing either
Konstantin's iconv 2.x as part of apr_iconv would lead to
the same problems. Either we will build something of our
own, and take responsibility for that, or we will build a
wrapper against what ever native iconv implementation
there is. Having duality is IMO a bad thing.

>>Also I think we should consider using
>>MultiByteToWideChar/WideCharToMultiByte on windows as
>>an option to apr_xlate if iconv is not present.
> 
> See my post on this very point.  The API does not lend itself
> whatsoever to streaming data.  Most every APR application is
> dealing with streamed text so this isn't really a useful option.
>

I'm sure we can think of something, but if you say that we
can not use native WIN32, then we could at least offer the
option to use gnu-iconv for win32 builds as well, or any other
iconv-api library.
Like said, I think that we should either build our own
implementation of iconv functionality, or just build a wrapper
over existing one.

So, I see the future of apr-iconv as ASF implementation of
iconv, with iconv abstraction layer moved to apr-utils, that
will use apr-iconv, as just one of the flavors of iconv api.

Regards,
Mladen.

Re: apr-iconv 1.0

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 12:23 AM 3/31/2005, Mladen Turk wrote:
>William A. Rowe, Jr. wrote:
>>FWIW - who else in this community wants to participate if such
>>an ASL/BSD licensed iconv project grew up around the most current
>>code for iconv?
>
>Well, I'm interested if the final product will not depend on a
>hundred or so separate .dll's for each and every charset, and
>if apr_util will not depend on apr_iconv.

I'd see that as advantage as well.  I understand loadable
modules for complex translations, but not for tables.

>Also I think we should consider using
>MultiByteToWideChar/WideCharToMultiByte on windows as
>an option to apr_xlate if iconv is not present.

See my post on this very point.  The API does not lend itself
whatsoever to streaming data.  Most every APR application is
dealing with streamed text so this isn't really a useful option.





Re: apr-iconv 1.0

Posted by Mladen Turk <mt...@apache.org>.
William A. Rowe, Jr. wrote:
> 
> FWIW - who else in this community wants to participate if such
> an ASL/BSD licensed iconv project grew up around the most current
> code for iconv?
>

Well, I'm interested if the final product will not depend on a
hundred or so separate .dll's for each and every charset, and
if apr_util will not depend on apr_iconv.

Also I think we should consider using
MultiByteToWideChar/WideCharToMultiByte on windows as
an option to apr_xlate if iconv is not present.

Regards,
Mladen.

Re: apr-iconv 1.0

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 03:50 PM 3/30/2005, Curt Arnold wrote:

>>Well, subversion does use apr_xlate.  But this was actually reported
>>by our log4cxx devs, who tripped over the subversion copy when they
>>thought they were using their own build.
>
>I made the initial report when attempting to use apr_xlate in log4cxx.

Yup

>What I saw trying to say was that Subversion would likely have not set APR_ICONV_PATH in Windows unless they were using apr_xlate.  

Which they do.

>2. Drop apr_iconv and support apr_xlate on platforms without a native iconv implementation using something else.

Yes.  Pick up the FreeBSD iconv 2.x with some win32 or other
atypical unix deployment patches.

>I could support option 2, but would need to know the identity and licensing terms of the "something else" first.

Again, FreeBSD iconv 2.x is maintained by Alexander, who hasn't
had the cycles himself to devote to it, or a community around it.
So, proposing a new community around it, either through the
incubator with the ASL license, or still under the BSD license
somewhere else (e.g. a sourceforge type location.)  There are
several communities who rely on the Kostantin implementation.
(FWIW - he's dropped his advertising clause in a note to Alexander,
but the incubator would probably wish to have something more
affirmative from him.)

I don't suggest we drop it from support in apr-util 1.0, but the
next major point bump of apr-util.  Provided it gels.

FWIW - who else in this community wants to participate if such
an ASL/BSD licensed iconv project grew up around the most current
code for iconv?

Bill





Re: apr-iconv 1.0

Posted by Curt Arnold <ca...@apache.org>.
On Mar 30, 2005, at 2:40 PM, William A. Rowe, Jr. wrote:

> At 01:02 PM 3/30/2005, Curt Arnold wrote:
>
>> If apr-iconv is an implementation detail of apr_xlate, then I don't 
>> see the significance of deprecating it.  Either it (or a replacement) 
>> is needed to support apr_xlate on platforms that don't provide an 
>> iconv or apr_xlate needs to be deprecated or removed.
>
> We drop it.  Point users 'elsewhere'.  "Not Maintained Here"
> sign on the door.  Tarball moves to archive.apache.org.  Perhaps
> same for svn repository.
>
> Obviously we want apr_xlate to point at -something- but the what
> would no longer be apr_iconv.
>
>> I assume that Subversion is actually using apr_xlate or they would 
>> not have bothered with setting APR_ICONV_PATH.  Does httpd use 
>> apr_xlate?
>> If so, I don't see how you could deprecate apr_xlate or make it only 
>> available on Unix platforms.
>
> Well, subversion does use apr_xlate.  But this was actually reported
> by our log4cxx devs, who tripped over the subversion copy when they
> thought they were using their own build.
>

I made the initial report when attempting to use apr_xlate in log4cxx.

What I saw trying to say was that Subversion would likely have not set 
APR_ICONV_PATH in Windows unless they were using apr_xlate.  So 
removing apr_xlate on Windows would likely break Subversion on Windows. 
  It may also break or reduce the functionality of httpd on Windows, but 
that is speculation.  I would not see dropping apr_xlate on platforms 
that don't have a native iconv implementation as acceptable.

I could see several potential interpretations of "deprecating 
apr-iconv" and the proposal needs to be clarified for a vote to be 
meaningful.  Here some of the potential interpretations:

1. Drop apr_iconv and suppress apr_xlate on platforms without a native 
iconv implementation.

2. Drop apr_iconv and support apr_xlate on platforms without a native 
iconv implementation using something else.

3. Drop apr_iconv and suppress apr_xlate on platforms without a native 
iconv implementation, but add another translation method that can be 
supported by the Windows API and iconv.

I don't think interpretation 1 could be supported since Subversion and 
likely httpd depend on apr_xlate and want to run on Windows platforms.  
Interpretation 2 would depend on the nature of the "something else".  I 
assume the minimum requirements would be support on Windows and an 
Apache compatible license.  I would assume that there would be 
resistance to interpretation 3.

I could support option 2, but would need to know the identity and 
licensing terms of the "something else" first.


Re: apr-iconv 1.0

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 01:02 PM 3/30/2005, Curt Arnold wrote:

>If apr-iconv is an implementation detail of apr_xlate, then I don't see the significance of deprecating it.  Either it (or a replacement) is needed to support apr_xlate on platforms that don't provide an iconv or apr_xlate needs to be deprecated or removed.

We drop it.  Point users 'elsewhere'.  "Not Maintained Here"
sign on the door.  Tarball moves to archive.apache.org.  Perhaps
same for svn repository.

Obviously we want apr_xlate to point at -something- but the what
would no longer be apr_iconv.

>I assume that Subversion is actually using apr_xlate or they would not have bothered with setting APR_ICONV_PATH.  Does httpd use apr_xlate?  
>If so, I don't see how you could deprecate apr_xlate or make it only available on Unix platforms.

Well, subversion does use apr_xlate.  But this was actually reported
by our log4cxx devs, who tripped over the subversion copy when they
thought they were using their own build.


Re: apr-iconv 1.0

Posted by Curt Arnold <ca...@houston.rr.com>.
On Mar 30, 2005, at 12:41 PM, William A. Rowe, Jr. wrote:
> Should we deprecate our apr-iconv as of apr-util 2.0?
>
> +1: wrowe
> -1:
>
> The resulting issue, plug the hole before apr-util 2.0 is ready.
> Until that happens, if the vote is affirmative, I'll be happy
> to modify apr.hw to reflect that apr_xlate is not available on
> Win32.
>

If apr-iconv is an implementation detail of apr_xlate, then I don't see 
the significance of deprecating it.  Either it (or a replacement) is 
needed to support apr_xlate on platforms that don't provide an iconv or 
apr_xlate needs to be deprecated or removed.

I assume that Subversion is actually using apr_xlate or they would not 
have bothered with setting APR_ICONV_PATH.  Does httpd use apr_xlate?  
If so, I don't see how you could deprecate apr_xlate or make it only 
available on Unix platforms.


Re: apr-iconv 1.0

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 10:17 AM 3/30/2005, Mladen Turk wrote:
>William A. Rowe, Jr. wrote:
>
>I've developed libxlate library about a year ago, so perhaps
>it deserves a second look. It can be integrated within apr-iconv,
>because it follows the iconv api.
>Since this is my code, I'm willing to donate it to ASF if
>accepted for apr-iconv.

The biggest issue is community.

apr-iconv clearly hasn't enjoyed enough community to retain the
code we have.  Contributions and help are scattershot.  Our choice
to integrate to apr was a poor one, because it doesn't enjoy any
adoption outside of the apr user community.  Simply, our solution
was overkill, and formatting changes made it difficult to maintain
in sync with the wider community.

libxlate, while it sounds like a slick idea, suffers the same issue.
One key point Roy raises is that we shouldn't as a project suffer
a dependency on a single individual.  You would need to first submit
this library to the incubator, and develop a community.

I dare say the apr community isn't interested in the nuts-and-bolts
of iconv/xlate/charset management.  We weren't even interested in
continuing serf, it moved from the apr project.  You might also
consider the apache commons project as a container for libxlate.

One concern, you mention you are offering this code under the ASL2,
and mention that it follows the GNU single-dll, table-based solution.
Did you look at the GNU implementation?

I've started a dialog with the BSD-licensed iconv community
principals to rebuild some traction for that library.  The Linux
community couldn't care less, as iconv has been moving into clib.
But those not using the gcc compiler will continue to experience
issues with iconv support.

Once your libxlate is an established project, I see no reason not
to set up the detection and configuration code to allow it from
apr-util.  Choices are good.

I'll submit this to vote now, anticipating we want some resolution
but don't want to break existing users:

Should we deprecate our apr-iconv as of apr-util 2.0?

+1: wrowe
-1: 

The resulting issue, plug the hole before apr-util 2.0 is ready.
Until that happens, if the vote is affirmative, I'll be happy
to modify apr.hw to reflect that apr_xlate is not available on
Win32.

Bill


Re: apr-iconv 1.0

Posted by Mladen Turk <mt...@apache.org>.
William A. Rowe, Jr. wrote:

 > This is the original thread we discussed apr-iconv going forward
 > in 1.0.  It seemed at the time our conclusion was that apr-iconv
 > would be an internal implementation, not for consumption by the
 > outside world.
 >

Well, not sure if I already posted this, but anyhow.

I've developed libxlate library about a year ago, so perhaps
it deserves a second look. It can be integrated within apr-iconv,
because it follows the iconv api.
Since this is my code, I'm willing to donate it to ASF if
accepted for apr-iconv.
Unlike Konstantin's implementation, mine follows the gnu iconv
concept with all data tables in a single file (.dll), and has
some features like case folding, transliteration, etc...

The source can be found at:
http://people.apache.org/~mturk/libxlate.zip

Regards,
Mladen.


Re: apr-iconv 1.0

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 05:12 PM 3/29/2005, Roy T.Fielding wrote:
>IMO, delete apr-iconv and let installers use the system-supplied
>library.

Respectfully, I don't think we can do so in apr-util 1.x.

The apr-util 2.0 release is the next major bump for deprecating 
interfaces, and we would have to alert users that it's as likely
as not that apr_xlate will return APR_ENOTIMPL more often than
it had previously done.

Nothing is stopping us from moving up to apr-util 2.0 tomorrow,
of course.

Now let me explain why we can't do this using Win32's internal
charset translation.  apr_xlate's api was too transparent around
iconv.  The Win32 API uses an all-or-nothing approach, returning
the number of bytes required to represent the translated text if
the results would not fit in the user supplied buffer.

The iconv implementation will perform a partial translation,
incrementing both the source and destination string pointers to
the end of the successfully translated text.

Because the source and destination code pages may match in bytes
per character, may be a multiple or fraction (e.g. ascii<>unicode)
or even multibyte representations, there is no way for the Win32
implementation to reinvoke w/the appropriate length of the source
text, except by backing up one byte at a time till we have a fit.

I understand there is a COM equivalent on Win32 that does perform
partial translations, but adding the COM overhead would be painful,
and surely less efficient than fitting iconv on win32.

My bottom-line proposal is to support iconv 2.0.3 on Win32, in the
framework of an iconv project.  I'm soliciting input from the
various platform maintainers of that long-neglected code.  Again
I don't see this change as realistic until apr-util 2.0.

Bill


Re: apr-iconv 1.0

Posted by Branko Čibej <br...@xbc.nu>.
William A. Rowe, Jr. wrote:

>At 06:04 PM 3/29/2005, Branko ÄŒibej wrote:
>
>  
>
>>The "best" solution would be to upgrade apr-iconv based on the iconv2
>>library, which doesn't use shared libraries for the charset files any
>>longer (it uses generated binary files in a different format).
>>    
>>
>
>This being the freebsd/newlib iconv 2.03?  We have to find something
>compatible with the BSD license. 
>
I think so. It's a new version of the library we're based on already, by 
The Same Author, and it looks like it's the same license.

-- Brane


Re: apr-iconv 1.0

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 06:04 PM 3/29/2005, Branko Čibej wrote:

>The "best" solution would be to upgrade apr-iconv based on the iconv2
>library, which doesn't use shared libraries for the charset files any
>longer (it uses generated binary files in a different format).

This being the freebsd/newlib iconv 2.03?  We have to find something
compatible with the BSD license. 


Re: apr-iconv 1.0

Posted by Branko Čibej <br...@xbc.nu>.
Roy T. Fielding wrote:

> IMO, delete apr-iconv and let installers use the system-supplied
> library.

Don't I wish... Except that there is no system-supplied library on
Windows, which currently seems to be the primary target for apr-iconv.
We use the system libiconv on Unix wherever it's available.

The "best" solution would be to upgrade apr-iconv based on the iconv2
library, which doesn't use shared libraries for the charset files any
longer (it uses generated binary files in a different format).

-- Brane



Re: apr-iconv 1.0

Posted by "Roy T. Fielding" <fi...@gbiv.com>.
IMO, delete apr-iconv and let installers use the system-supplied
library.

....Roy


Re: apr-iconv 1.0

Posted by SteveKing <st...@gmx.ch>.
William A. Rowe, Jr. wrote:

>   This is exactly my desired solution.  If you would like to
> submit your patch to the list (submissions by reference aren't
> accepted without extra hassles) I would very much like to adopt 
> it.  (Especially as opposed to reinventing it.)

Ok, patch attached.

You can also find the complete file TSVN uses here:
<https://svn.collab.net/repos/tortoisesvn/trunk/ext/apr-iconv_patch/lib/iconv_module.c>

Stefan

-- 
        ___
   oo  // \\      "De Chelonian Mobile"
  (_,\/ \_/ \     TortoiseSVN
    \ \_/_\_/>    The coolest Interface to (Sub)Version Control
    /_/   \_\     http://tortoisesvn.tigris.org

Re: apr-iconv 1.0

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
Steve,

  This is exactly my desired solution.  If you would like to
submit your patch to the list (submissions by reference aren't
accepted without extra hassles) I would very much like to adopt 
it.  (Especially as opposed to reinventing it.)

  My proposal of LD_LIBRARY_PATH corresponds to the fact that
Unix doesn't have the GetModuleFileName concept to figure out
where the absolute path of the libapriconv-1.so module lies.
Since we have to make exceptions for Win32, and HP/UX, I see
no reason not to use the native solution to Win32.  

  Incidentally, do we need to search DYLD_LIBRARY_PATH or 
LD_LIBRARY_PATH on OSX?

Bill

At 09:49 AM 3/30/2005, SteveKing wrote:
>Branko Čibej wrote:
>>William A. Rowe, Jr. wrote:
>>
>>>At 12:36 AM 3/30/2005, Curt Arnold wrote:
>>> 
>>>
>>>>However, it would be good if we could get some comment from a Subversion developer.
>>>>  
>>>
>>>Agreed, and Branko's watching this thread.
>>> 
>>Yup.
>>It might also be a good thing to ping Steve King of TortoiseSVN fame; TSVN also ships a hacked version of apr-iconv, I believe the hack is to ignore APR_ICONV_PATH. The issue IIRC is collision between memory allocators in different runtime librararies on Windows; but anyway, Steve should be able to explain.
>
>The reasons why TSVN ships a patched version of apr-iconv:
>
>- the *.so files are actually dll's which require a specific C-runtime. This isn't a big problem if APR_ICONV_PATH points to a iconv compiled with VC6. But if it points to an iconv compiled with >VC7, then programs not using the new c-runtime (> version 7) won't run anymore. A 'solution' to that problem would be to install the runtime in e.g. the system dir on windows, but even MS tells to _not_ do that.
>
>- every program which installs its own iconv lib might change the APR_ICONV_PATH variable (believe me: most programs _don't_ check if it's already set) and overwrite this with maybe an older version of the lib!
>
>- looking up the env variable isn't very fast, i.e. it takes more time to load the *.so modules than other methods.
>
>The changes I made to apr-iconv for TSVN:
>
>- add a DllMain() to iconv_module which caches the path to the *.so modules (i.e. the path is determined only once when the iconv dll is loaded, not for every *.so module) - this is a _very_ big speed improvement for Subversion which makes intensive use of *.so modules.
>
>- the path is determined like this now:
>  (in the following description, THISPATH refers to the path where the libapriconv.dll is located, fetched via GetModuleFileName() API)
>  1. path exists THISPATH\iconv
>  2. path exists ..\THISPATH\iconv
>  3. use APR_ICONV_PATH
>  by doing this, it's assured that a program like TSVN always uses the iconv lib it has installed itself. Only if it doesn't install the iconv lib itself or it can't be found in 1) or 2), then the APR_ICONV_PATH env variable is used to find the *.so modules.
>
>You can find the patched iconv_module.c file here:
>https://svn.collab.net/repos/tortoisesvn/trunk/ext/apr-iconv_patch/lib/iconv_module.c
>
>Ever since TSVN ships with this patched iconv version, we haven't had any more problems with other Subversion clients interfering or missing c-runtime dll's or memory leaks due to incompatible c-runtimes (according to MS, the version 6 and 7 of the c-runtime are not compatible and use different memory allocators which leads to memory leaks and even heap corruptions if a program uses the wrong version).
>
>So I'm not sure what you guys have planned to change in apr-iconv, but I strongly recommend that if you change something, get rid of the APR_ICONV_PATH (on windows at least), or at least use it as the last option like TSVN does now.
>
>Stefan
>
>-- 
>       ___
>  oo  // \\      "De Chelonian Mobile"
> (_,\/ \_/ \     TortoiseSVN
>   \ \_/_\_/>    The coolest Interface to (Sub)Version Control
>   /_/   \_\     http://tortoisesvn.tigris.org



Re: apr-iconv 1.0

Posted by SteveKing <st...@gmx.ch>.
Branko Čibej wrote:
> SteveKing wrote:
> 
>> The reasons why TSVN ships a patched version of apr-iconv:
> 
> 
> Thanks for the exhaustive report, Stefan.
> 
> [...]
> 
>>   2. path exists ..\THISPATH\iconv
> 
> 
> Surely you mean THISPATH\..\iconv?

Yep! Sorry for the confusion.

Stefan

-- 
        ___
   oo  // \\      "De Chelonian Mobile"
  (_,\/ \_/ \     TortoiseSVN
    \ \_/_\_/>    The coolest Interface to (Sub)Version Control
    /_/   \_\     http://tortoisesvn.tigris.org

Re: apr-iconv 1.0

Posted by Branko Čibej <br...@xbc.nu>.
SteveKing wrote:

> The reasons why TSVN ships a patched version of apr-iconv:

Thanks for the exhaustive report, Stefan.

[...]

>   2. path exists ..\THISPATH\iconv

Surely you mean THISPATH\..\iconv?

 
-- Brane


Re: apr-iconv 1.0

Posted by SteveKing <st...@gmx.ch>.
Branko Čibej wrote:
> William A. Rowe, Jr. wrote:
> 
>> At 12:36 AM 3/30/2005, Curt Arnold wrote:
>>  
>>
>>> However, it would be good if we could get some comment from a 
>>> Subversion developer.
>>>   
>>
>> Agreed, and Branko's watching this thread.
>>  
>>
> Yup.
> 
> It might also be a good thing to ping Steve King of TortoiseSVN fame; 
> TSVN also ships a hacked version of apr-iconv, I believe the hack is to 
> ignore APR_ICONV_PATH. The issue IIRC is collision between memory 
> allocators in different runtime librararies on Windows; but anyway, 
> Steve should be able to explain.

The reasons why TSVN ships a patched version of apr-iconv:

- the *.so files are actually dll's which require a specific C-runtime. 
This isn't a big problem if APR_ICONV_PATH points to a iconv compiled 
with VC6. But if it points to an iconv compiled with >VC7, then programs 
not using the new c-runtime (> version 7) won't run anymore. A 
'solution' to that problem would be to install the runtime in e.g. the 
system dir on windows, but even MS tells to _not_ do that.

- every program which installs its own iconv lib might change the 
APR_ICONV_PATH variable (believe me: most programs _don't_ check if it's 
already set) and overwrite this with maybe an older version of the lib!

- looking up the env variable isn't very fast, i.e. it takes more time 
to load the *.so modules than other methods.

The changes I made to apr-iconv for TSVN:

- add a DllMain() to iconv_module which caches the path to the *.so 
modules (i.e. the path is determined only once when the iconv dll is 
loaded, not for every *.so module) - this is a _very_ big speed 
improvement for Subversion which makes intensive use of *.so modules.

- the path is determined like this now:
   (in the following description, THISPATH refers to the path where the 
libapriconv.dll is located, fetched via GetModuleFileName() API)
   1. path exists THISPATH\iconv
   2. path exists ..\THISPATH\iconv
   3. use APR_ICONV_PATH
   by doing this, it's assured that a program like TSVN always uses the 
iconv lib it has installed itself. Only if it doesn't install the iconv 
lib itself or it can't be found in 1) or 2), then the APR_ICONV_PATH env 
variable is used to find the *.so modules.

You can find the patched iconv_module.c file here:
https://svn.collab.net/repos/tortoisesvn/trunk/ext/apr-iconv_patch/lib/iconv_module.c

Ever since TSVN ships with this patched iconv version, we haven't had 
any more problems with other Subversion clients interfering or missing 
c-runtime dll's or memory leaks due to incompatible c-runtimes 
(according to MS, the version 6 and 7 of the c-runtime are not 
compatible and use different memory allocators which leads to memory 
leaks and even heap corruptions if a program uses the wrong version).

So I'm not sure what you guys have planned to change in apr-iconv, but I 
strongly recommend that if you change something, get rid of the 
APR_ICONV_PATH (on windows at least), or at least use it as the last 
option like TSVN does now.

Stefan

-- 
        ___
   oo  // \\      "De Chelonian Mobile"
  (_,\/ \_/ \     TortoiseSVN
    \ \_/_\_/>    The coolest Interface to (Sub)Version Control
    /_/   \_\     http://tortoisesvn.tigris.org

Re: apr-iconv 1.0

Posted by Branko Čibej <br...@xbc.nu>.
William A. Rowe, Jr. wrote:

>At 12:36 AM 3/30/2005, Curt Arnold wrote:
>  
>
>>However, it would be good if we could get some comment from a Subversion developer.
>>    
>>
>Agreed, and Branko's watching this thread.
>  
>
Yup.

It might also be a good thing to ping Steve King of TortoiseSVN fame; 
TSVN also ships a hacked version of apr-iconv, I believe the hack is to 
ignore APR_ICONV_PATH. The issue IIRC is collision between memory 
allocators in different runtime librararies on Windows; but anyway, 
Steve should be able to explain.

-- Brane


Re: apr-iconv 1.0

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 12:36 AM 3/30/2005, Curt Arnold wrote:

>On Mar 29, 2005, at 11:13 PM, William A. Rowe, Jr. wrote:
>
>>At 10:09 PM 3/29/2005, Curt Arnold wrote:
>>>I'm unclear on the proposed resolution.  Does it include a mechanism to distinguish apriconv-1 encoding modules from apriconv-0 modules either by changing the module signature or name?
>>
>>I believe we should do that, too.  But it seems that some want
>>us to use an 'external' iconv implementation, at least, as of
>>the next major release of apr-util.  With the files residing
>>in /apriconv-1/*.so I'm less concerned with the signature.
>
>Are you expecting the apriconv-1 directory to be on the PATH or LD_LIBRARY_PATH 

yes.

>or relative to a directory on the PATH or LD_LIBRARY_PATH?  

Well the directory apriconv-1 will reside in a directory pointed
to by PATH/LD_LIBRARY_PATH/SHLIB_PATH.  And (at least on win32) 
within the cwd as we also discover binaries that way.

The theory is simple.  We -found- libapriconv-1.so/.dll.  We'd
already loaded it.  How?  Because it resided in exactly those 
paths, or the subdirectory of the startup dir on win32.

The trivial rule is that libapriconv-1/ directory must sit beside
libapriconv-1.so/.dll.

>If it is on the PATH or LD_LIBRARY_PATH, then this is the continued possibility of collision with modules from other iconv implementations, if it is relative to an directory on the path, then I would expect that you would need add a decent amount of complexity to accomplish that.  

Not really.  We concatenate two strings, or three.

>I would think it may be simpler to have modules named apriconv-1-ENCODING.so or .dll and have dlopen or LoadLibrary find them with their existing library search mechanisms, but I could understand if that was undesirable for other reasons.

Hmmm.  String concats are string concats.  Visually, I believed
a directory libapriconv-1 was more clear.  I don't think adding
the extra filename information buys us anything, since -all- iconv
implementations have always shared the concept of a separate dir
for loadable modules.

>>>Without some mechanism to distinguish encoding modules, it really doesn't matter what order the paths are evaluated?  If there is a mechanism, it shouldn't hurt to evaluate APR_ICONV_PATH first?
>>
>>Of course it would, because we would still have to iterate
>>multiple .so files to find the correct one.  This is a huge
>>performance hit.
>>
>>So let me inquire - have you authored a custom (charset).so file,
>>or intend to do so?
>
>No, I have no intention to do that.  At this point, the resolution to the issue does not effect log4cxx since it will continue to hack apriconv to embed the encodings into log4cxx.dll and liblog4cxx.so.  
>You need to do what is right for httpd and/or Subversion.

Ack.  And I hope - do something that log4cxx ultimately adopts.

>>Again, the consensus on list was that apr-iconv
>>was an implementation-internal detail of apr-util.  With the next
>>release, implementors/users are expected to install the directory
>>apriconv-1 in parallel to apriconv-1.so [.dll].  APR_ICONV_PATH
>>would be searched last, to pick up those exceptions/custom charset
>>modules.  We wouldn't hit iconv/ at all (unless the user asks us
>>to through APR_ICONV_PATH.
>
>Since Subversion sets APR_ICONV_PATH, there is still the possibility of finding an apriconv-0 encoding module if the installation is damaged (or the developer decides not to distribute some obscure encoding) or a new encoding gets added to Subversion and/or apriconv-0.

Right, except that we would seek APR_ICONV_PATH last.  Minimize
the chance of a collision.  But certainly, not exclude every
possibility.

>>So I'm confused if this solution doesn't resolve the issues of all
>>users.  Please explain further.
>>
>>On your point of avoiding conflicts by changing the signature, yes
>>I would be -happy- to entertain a patch, if you want to post one in
>>the next day.  I'll be putting together my changes tomorrow eve,
>>and would be easily able to integrate your suggestion if you care
>>to offer the patch.  I don't see it as a substitute for searching
>>for the right file, but I see it as added security against loading
>>the wrong module.
>
>By changing the signature, I meant changing the value of ICMOD_UC_CCS and ICMOD_UC_CES (currently 0x100 and 0x101).  

Ack.

>I don't know if this method of module identification is used elsewhere and if so, what other values are currently in use.  If the encoding modules can't be distinguished by name, I think it is essential that they have a different magic value.  Otherwise, you are just reducing the chances of a catastrophic failure by checking APR_ICONV_PATH last, not eliminating it.

Yes.

>Putting myself in the position of the developer of some hypothetical app (and I couldn't have an iconv2 based solution), I would prefer that apriconv-1 encoding modules be named something like "apriconv-1-utf-8.dll" or "libapriconv-1-utf-8.so" and located using the established library search algorithms in dlopen or LoadLibrary.  In deployment, these would likely be placed in the same directory as aprutil-1.dll or libaprutil-1.so or the executable if linked against a static aprutil-1.  APR_ICONV_PATH could be ignored.  If the user really wanted a new encoding, they could either place it in an already searched directory or add the directory to PATH or LD_LIBRARY_PATH.
>
>However, it would be good if we could get some comment from a Subversion developer.

Agreed, and Branko's watching this thread.

Bill


Re: apr-iconv 1.0

Posted by Curt Arnold <ca...@apache.org>.
On Mar 29, 2005, at 11:13 PM, William A. Rowe, Jr. wrote:

> At 10:09 PM 3/29/2005, Curt Arnold wrote:
>> I'm unclear on the proposed resolution.  Does it include a mechanism 
>> to distinguish apriconv-1 encoding modules from apriconv-0 modules 
>> either by changing the module signature or name?
>
> I believe we should do that, too.  But it seems that some want
> us to use an 'external' iconv implementation, at least, as of
> the next major release of apr-util.  With the files residing
> in /apriconv-1/*.so I'm less concerned with the signature.

Are you expecting the apriconv-1 directory to be on the PATH or 
LD_LIBRARY_PATH or relative to a directory on the PATH or 
LD_LIBRARY_PATH?  If it is on the PATH or LD_LIBRARY_PATH, then this is 
the continued possibility of collision with modules from other iconv 
implementations, if it is relative to an directory on the path, then I 
would expect that you would need add a decent amount of complexity to 
accomplish that.  I would think it may be simpler to have modules named 
apriconv-1-ENCODING.so or .dll and have dlopen or LoadLibrary find them 
with their existing library search mechanisms, but I could understand 
if that was undesirable for other reasons.

>
>> Without some mechanism to distinguish encoding modules, it really 
>> doesn't matter what order the paths are evaluated?  If there is a 
>> mechanism, it shouldn't hurt to evaluate APR_ICONV_PATH first?
>
> Of course it would, because we would still have to iterate
> multiple .so files to find the correct one.  This is a huge
> performance hit.
>
> So let me inquire - have you authored a custom (charset).so file,
> or intend to do so?

No, I have no intention to do that.  At this point, the resolution to 
the issue does not effect log4cxx since it will continue to hack 
apriconv to embed the encodings into log4cxx.dll and liblog4cxx.so.  
You need to do what is right for httpd and/or Subversion.


> Again, the consensus on list was that apr-iconv
> was an implementation-internal detail of apr-util.  With the next
> release, implementors/users are expected to install the directory
> apriconv-1 in parallel to apriconv-1.so [.dll].  APR_ICONV_PATH
> would be searched last, to pick up those exceptions/custom charset
> modules.  We wouldn't hit iconv/ at all (unless the user asks us
> to through APR_ICONV_PATH.

Since Subversion sets APR_ICONV_PATH, there is still the possibility of 
finding an apriconv-0 encoding module if the installation is damaged 
(or the developer decides not to distribute some obscure encoding) or a 
new encoding gets added to Subversion and/or apriconv-0.

>
> So I'm confused if this solution doesn't resolve the issues of all
> users.  Please explain further.
>
> On your point of avoiding conflicts by changing the signature, yes
> I would be -happy- to entertain a patch, if you want to post one in
> the next day.  I'll be putting together my changes tomorrow eve,
> and would be easily able to integrate your suggestion if you care
> to offer the patch.  I don't see it as a substitute for searching
> for the right file, but I see it as added security against loading
> the wrong module.
>

By changing the signature, I meant changing the value of ICMOD_UC_CCS 
and ICMOD_UC_CES (currently 0x100 and 0x101).  I don't know if this 
method of module identification is used elsewhere and if so, what other 
values are currently in use.  If the encoding modules can't be 
distinguished by name, I think it is essential that they have a 
different magic value.  Otherwise, you are just reducing the chances of 
a catastrophic failure by checking APR_ICONV_PATH last, not eliminating 
it.

Putting myself in the position of the developer of some hypothetical 
app (and I couldn't have an iconv2 based solution), I would prefer that 
apriconv-1 encoding modules be named something like 
"apriconv-1-utf-8.dll" or "libapriconv-1-utf-8.so" and located using 
the established library search algorithms in dlopen or LoadLibrary.  In 
deployment, these would likely be placed in the same directory as 
aprutil-1.dll or libaprutil-1.so or the executable if linked against a 
static aprutil-1.  APR_ICONV_PATH could be ignored.  If the user really 
wanted a new encoding, they could either place it in an already 
searched directory or add the directory to PATH or LD_LIBRARY_PATH.

However, it would be good if we could get some comment from a 
Subversion developer.



Re: apr-iconv 1.0

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 10:09 PM 3/29/2005, Curt Arnold wrote:
>I'm unclear on the proposed resolution.  Does it include a mechanism to distinguish apriconv-1 encoding modules from apriconv-0 modules either by changing the module signature or name? 

I believe we should do that, too.  But it seems that some want
us to use an 'external' iconv implementation, at least, as of
the next major release of apr-util.  With the files residing
in /apriconv-1/*.so I'm less concerned with the signature.

>Without some mechanism to distinguish encoding modules, it really doesn't matter what order the paths are evaluated?  If there is a mechanism, it shouldn't hurt to evaluate APR_ICONV_PATH first?

Of course it would, because we would still have to iterate
multiple .so files to find the correct one.  This is a huge
performance hit.

So let me inquire - have you authored a custom (charset).so file,
or intend to do so?  Again, the consensus on list was that apr-iconv
was an implementation-internal detail of apr-util.  With the next
release, implementors/users are expected to install the directory
apriconv-1 in parallel to apriconv-1.so [.dll].  APR_ICONV_PATH
would be searched last, to pick up those exceptions/custom charset
modules.  We wouldn't hit iconv/ at all (unless the user asks us
to through APR_ICONV_PATH.

So I'm confused if this solution doesn't resolve the issues of all
users.  Please explain further.

On your point of avoiding conflicts by changing the signature, yes
I would be -happy- to entertain a patch, if you want to post one in
the next day.  I'll be putting together my changes tomorrow eve,
and would be easily able to integrate your suggestion if you care
to offer the patch.  I don't see it as a substitute for searching
for the right file, but I see it as added security against loading
the wrong module.

Bill


Re: apr-iconv 1.0

Posted by Curt Arnold <ca...@apache.org>.
I'm unclear on the proposed resolution.  Does it include a mechanism to 
distinguish apriconv-1 encoding modules from apriconv-0 modules either 
by changing the module signature or name? 	Without some mechanism to 
distinguish encoding modules, it really doesn't matter what order the 
paths are evaluated?  If there is a mechanism, it shouldn't hurt to 
evaluate APR_ICONV_PATH first?