Posted to dev@subversion.apache.org by "Peter N. Lundblad" <pe...@famlundblad.se> on 2004/08/04 17:39:00 UTC

UTF8 validation

Hi,

We have code to validate UTF-8 and we use it when converting to UTF-8.
But, when we receive UTF-8 (possibly URI-encoded), who makes sure this is
valid? Is this guaranteed by the iconvs that we use?

Regards,
//Peter

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: UTF8 validation

Posted by "Peter N. Lundblad" <pe...@famlundblad.se>.
On Sat, 7 Aug 2004, Greg Hudson wrote:

> On Wed, 2004-08-04 at 13:39, Peter N. Lundblad wrote:
> > We have code to validate UTF-8 and we use it when converting to UTF-8.
> > But, when we receive UTF-8 (possibly URI-encoded), who makes sure this is
> > valid? Is this guaranteed by the iconvs that we use?
>
> I don't think iconv takes care of it, but our utf.c appears to have some
> UTF-8 validation code as of r8581.
>
Yes, but it seems to be used only when converting *to* UTF-8. So if we,
for example, check for "/../" on the server and then convert from UTF-8,
evil clients could use overlong UTF-8 sequences (multi-byte encodings of
characters that fit in a single byte) to represent these characters and
circumvent our checks. If we also add validation to the conversions from
UTF-8, such input should never reach the operating system.
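
To make the concern concrete, here is a minimal sketch in Python (chosen
for brevity; it is not the code in utf.c). The names lenient_decode and
is_valid_utf8 are hypothetical. It assumes a converter that trusts lead
bytes without checking for shortest-form encoding, and shows how a
URI-encoded overlong sequence slips past a byte-level "/../" check unless
the input is validated first.

```python
from urllib.parse import unquote_to_bytes


def lenient_decode(data: bytes) -> str:
    """Hypothetical naive decoder: it trusts the lead byte and never
    rejects overlong forms. Shown only to illustrate the unsafe case."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            cp, n = b, 1
        elif b >> 5 == 0b110:
            cp, n = b & 0x1F, 2
        elif b >> 4 == 0b1110:
            cp, n = b & 0x0F, 3
        elif b >> 3 == 0b11110:
            cp, n = b & 0x07, 4
        else:
            raise ValueError("bad lead byte")
        for c in data[i + 1:i + n]:
            cp = (cp << 6) | (c & 0x3F)
        out.append(chr(cp))
        i += n
    return "".join(out)


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical strict check: rejects overlong forms, surrogates,
    truncated sequences and code points above U+10FFFF."""
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            i += 1
            continue
        if 0xC0 <= b <= 0xDF:
            n, min_cp, cp = 2, 0x80, b & 0x1F
        elif 0xE0 <= b <= 0xEF:
            n, min_cp, cp = 3, 0x800, b & 0x0F
        elif 0xF0 <= b <= 0xF7:
            n, min_cp, cp = 4, 0x10000, b & 0x07
        else:
            return False  # stray continuation byte or invalid lead byte
        tail = data[i + 1:i + n]
        if len(tail) != n - 1 or any(c & 0xC0 != 0x80 for c in tail):
            return False
        for c in tail:
            cp = (cp << 6) | (c & 0x3F)
        # cp < min_cp means a shorter encoding exists: an overlong form.
        if cp < min_cp or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
            return False
        i += n
    return True


# 0xC0 0xAE is an overlong two-byte encoding of "." (U+002E).
payload = unquote_to_bytes("/%C0%AE%C0%AE/")

check_bypassed = b"/../" not in payload         # byte-level check sees nothing
decoded_path = lenient_decode(payload)          # yet this yields "/../"
rejected_by_validation = not is_valid_utf8(payload)
```

With validation in front of the conversion, the payload is refused before
any path comparison or system call sees the decoded string.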

I'll do this if no one objects.

Regards,
//Peter


Re: UTF8 validation

Posted by Greg Hudson <gh...@MIT.EDU>.
On Wed, 2004-08-04 at 13:39, Peter N. Lundblad wrote:
> We have code to validate UTF-8 and we use it when converting to UTF-8.
> But, when we receive UTF-8 (possibly URI-encoded), who makes sure this is
> valid? Is this guaranteed by the iconvs that we use?

I don't think iconv takes care of it, but our utf.c appears to have some
UTF-8 validation code as of r8581.

