Posted to triplesoup-dev@incubator.apache.org by Joe Schaefer <jo...@sunstarsys.com> on 2007/05/01 02:29:23 UTC

libapreq2?

Although I haven't looked at the codebase yet,
this looks to me like it will be a fun project.
I'd be happy to help out with some of the stuff
I'm more familiar with, namely integrating Apache-Test
and/or using libapreq2 for parsing inbound POST data.

Have you considered the idea of using libapreq2's parsing
infrastructure for handling POST?  If not, I'd be happy
to try and pitch the advantages of doing so, which
hopefully outweigh the problem of introducing a
new dependency.

-- 
Joe Schaefer

Re: libapreq2?

Posted by Joe Schaefer <jo...@sunstarsys.com>.

Joe Schaefer <jo...@sunstarsys.com> writes:

> 2) this code
>
>     if (rawQuery)
>         rawQuery  = apr_pstrcat(r->pool, rawQuery, data, NULL);
>     else
>         rawQuery = (char *)data;
>
> is dangerous for two reasons: the total allocation is quadratic (O(n^2)),
> because each apr_pstrcat call copies everything read so far into a new
> pool allocation, and data may be a freed pointer by the time it's used
> later in the
> code. One way to fix the allocation issue, I think, is to use a
> doubling algorithm (always allocate twice the current length, and
> track how much is being used), but I haven't tested it, and that's not
> what apreq actually uses. 

I've thought about this a bit more, and the technique I describe
should amount to a total allocation of not more than 4*size of input,
which is O(n) (and therefore ok).  Do we all see the problem yet,
or should I fill in more of the details?
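
For reference, the doubling idea could be sketched roughly as below. This
is only an illustration under stated assumptions: it uses plain malloc in
place of an APR pool, and struct growbuf, growbuf_append and the
"allocated" counter are hypothetical names, not anything in APR, apreq or
mod_sparql. The counter mimics pool behaviour by adding up every block
ever requested, as if nothing were returned until the pool is destroyed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer; not part of APR or apreq. */
struct growbuf {
    char  *data;      /* NUL-terminated contents */
    size_t len;       /* bytes in use, excluding the NUL */
    size_t cap;       /* bytes currently allocated */
    size_t allocated; /* running total of all bytes ever requested */
};

/* Append n bytes from src, doubling the capacity whenever it runs out.
 * The bytes are copied, so the caller's chunk may be freed afterwards.
 * Returns 0 on success, -1 if malloc fails. */
static int growbuf_append(struct growbuf *b, const char *src, size_t n)
{
    size_t need = b->len + n + 1;          /* +1 for the trailing NUL */

    if (need > b->cap) {
        size_t cap = b->cap ? b->cap : 16; /* arbitrary starting size */
        char *p;

        while (cap < need)
            cap *= 2;
        p = malloc(cap);
        if (p == NULL)
            return -1;
        if (b->len)
            memcpy(p, b->data, b->len);
        free(b->data);
        b->data = p;
        b->cap = cap;
        b->allocated += cap;               /* pool-style accounting */
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
    b->data[b->len] = '\0';
    assert(b->len < b->cap);
    return 0;
}
```

Starting from a zeroed struct growbuf and appending each chunk as it
arrives, the capacities requested form a geometric series, so once the
input is larger than the starting size the running total stays under
4*size of input, instead of growing with the square of the number of
chunks.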

-- 
Joe Schaefer

Re: libapreq2?

Posted by Joe Schaefer <jo...@sunstarsys.com>.

David Reid <da...@jetnet.co.uk> writes:

>> For instance, it took quite a long time to work out
>> the bugs in httpd's header parsing facilities, mainly
>> because it's very easy for good programmers to take
>> a relatively simple task like that and optimize their
>> first crack at the code for simplicity and efficiency
>> over safety and correctness.  I took a brief look 
>> at mod_sparql's implementation and see some of the
>> same problems there.  It wouldn't be hard to fix
>> them, but I really believe we should use apreq's table
>> API instead.
>
> What problems? Be interested to see details...

In the current code, I see two problems:

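1) mod_sparql is tokenizing after decoding, when what should
happen is the reverse.  Otherwise an encoded "&&" (%26%26, e.g.
SPARQL's logical-and operator inside a query) will be mistaken
for parameter separators and trip things up.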

2) this code

                if (rawQuery)
                    rawQuery  = apr_pstrcat(r->pool, rawQuery, data, NULL);
                else
                    rawQuery = (char *)data;

is dangerous for two reasons: the total allocation is quadratic (O(n^2)),
because each apr_pstrcat call copies everything read so far into a new
pool allocation, and data may be a freed pointer by the time it's used
later in the code.
One way to fix the allocation issue, I think, is to use a doubling algorithm 
(always allocate twice the current length, and track how much is being
used), but I haven't tested it, and that's not what apreq actually uses.
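
To make point 1 concrete, here is a minimal sketch of tokenizing first and
decoding afterwards. It is standard C only and is not how apreq or
mod_sparql implement it; hexval, url_decode and find_param are
hypothetical helper names, and the parameter names used with it are just
examples.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hexval(int c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    c = tolower(c);
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    return -1;
}

/* Decode %XX escapes and '+' in place.
 * Returns 0 on success, -1 on a malformed escape or an encoded NUL. */
static int url_decode(char *s)
{
    char *out = s;

    while (*s) {
        if (*s == '%') {
            int hi = hexval((unsigned char)s[1]);
            int lo = hi < 0 ? -1 : hexval((unsigned char)s[2]);

            if (lo < 0 || (hi == 0 && lo == 0))
                return -1;
            *out++ = (char)(hi * 16 + lo);
            s += 3;
        } else {
            *out++ = (*s == '+') ? ' ' : *s;
            s++;
        }
    }
    *out = '\0';
    return 0;
}

/* Find key in a raw urlencoded string.  The string is split on the
 * literal '&' and '=' characters first; only then is each piece decoded,
 * so an encoded %26 can never act as a separator.  raw is modified in
 * place and is consumed by the call.
 * Returns the decoded value, or NULL if absent or malformed. */
static char *find_param(char *raw, const char *key)
{
    char *pair = raw;

    while (pair != NULL) {
        char *next = strchr(pair, '&');
        char *val;

        if (next != NULL)
            *next++ = '\0';
        val = strchr(pair, '=');
        if (val != NULL) {
            *val++ = '\0';
            if (url_decode(pair) == 0 && strcmp(pair, key) == 0)
                return url_decode(val) == 0 ? val : NULL;
        }
        pair = next;
    }
    return NULL;
}
```

Called on a writable copy of
"query=ASK+%7B+FILTER+(true+%26%26+false)+%7D&default-graph-uri=http%3A%2F%2Fexample.org%2Fg",
find_param with the key "query" yields the query text with its "&&"
intact, because the only '&' seen while splitting is the real separator.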

-- 
Joe Schaefer

Re: libapreq2?

Posted by David Reid <da...@jetnet.co.uk>.

Joe Schaefer wrote:
> Leo Simons <ma...@leosimons.com> writes:
> 
>> On May 1, 2007, at 4:29 AM, Joe Schaefer wrote:
>>> Although I haven't looked at the codebase yet,
>>> this looks to me like it will be a fun project.
>>> I'd be happy to help out with some of the stuff
>>> I'm more familiar with, namely integrating Apache-Test
>>> and/or using libapreq2 for parsing inbound POST data.
>> Cool!
>>
>> I've already (guided by Garrett) tried to start doing some
>> Apache-Test usage. I don't really know much about libapreq2 at all.
> 
> Having briefly scanned the sparql-query spec, it looks to
> me like the protocol is basically urlencoded (or xml encoded)
> table data.  If so, that makes apreq a good fit, since
> its accessors are essentially table lookups.
> 
>>> Have you considered the idea of using libapreq2's parsing
>>> infrastructure for handling POST?  If not, I'd be happy
>>> to try and pitch the advantages of doing so, which
>>> hopefully outweigh the problem of introducing a
>>> new dependency.
>> Please do! As far as dependencies go I'm not too fussed really,
>> since it seems we already depend on the stuff libapreq2 uses.
> 
> The main advantage of using apreq is that it will provide
> access to the meat of the sparql-query protocol to any
> apache handler.  Clients of apreq all share the resulting
> parse data, which is in contrast to what usually happens
> inside httpd (first handler to parse it steals the show).
> If there proves to be an opportunity to develop things
> like authorization handlers or output filters
> around the protocol, apreq makes all that relatively
> trivial to do.  If there isn't, and the only thing
> that makes sense is a monolithic mod_sparql handler,
> then it still is a win to use apreq because parsing
> user input in C sucks.  It's better to use a library
> dedicated to the task instead of rolling your own.
> 
> For instance, it took quite a long time to work out
> the bugs in httpd's header parsing facilities, mainly
> because it's very easy for good programmers to take
> a relatively simple task like that and optimize their
> first crack at the code for simplicity and efficiency
> over safety and correctness.  I took a brief look 
> at mod_sparql's implementation and see some of the
> same problems there.  It wouldn't be hard to fix
> them, but I really believe we should use apreq's table
> API instead.

What problems? Be interested to see details...

Re: libapreq2?

Posted by Joe Schaefer <jo...@sunstarsys.com>.

Leo Simons <ma...@leosimons.com> writes:

> On May 1, 2007, at 4:29 AM, Joe Schaefer wrote:
>> Although I haven't looked at the codebase yet,
>> this looks to me like it will be a fun project.
>> I'd be happy to help out with some of the stuff
>> I'm more familiar with, namely integrating Apache-Test
>> and/or using libapreq2 for parsing inbound POST data.
>
> Cool!
>
> I've already (guided by Garrett) tried to start doing some
> Apache-Test usage. I don't really know much about libapreq2 at all.

Having briefly scanned the sparql-query spec, it looks to
me like the protocol is basically urlencoded (or xml encoded)
table data.  If so, that makes apreq a good fit, since
its accessors are essentially table lookups.

>> Have you considered the idea of using libapreq2's parsing
>> infrastructure for handling POST?  If not, I'd be happy
>> to try and pitch the advantages of doing so, which
>> hopefully outweigh the problem of introducing a
>> new dependency.
>
> Please do! As far as dependencies go I'm not too fussed really,
> since it seems we already depend on the stuff libapreq2 uses.

The main advantage of using apreq is that it will provide
access to the meat of the sparql-query protocol to any
apache handler.  Clients of apreq all share the resulting
parse data, which is in contrast to what usually happens
inside httpd (first handler to parse it steals the show).
If there proves to be an opportunity to develop things
like authorization handlers or output filters
around the protocol, apreq makes all that relatively
trivial to do.  If there isn't, and the only thing
that makes sense is a monolithic mod_sparql handler,
then it still is a win to use apreq because parsing
user input in C sucks.  It's better to use a library
dedicated to the task instead of rolling your own.

For instance, it took quite a long time to work out
the bugs in httpd's header parsing facilities, mainly
because it's very easy for good programmers to take
a relatively simple task like that and optimize their
first crack at the code for simplicity and efficiency
over safety and correctness.  I took a brief look 
at mod_sparql's implementation and see some of the
same problems there.  It wouldn't be hard to fix
them, but I really believe we should use apreq's table
API instead.

-- 
Joe Schaefer

Re: libapreq2?

Posted by Leo Simons <ma...@leosimons.com>.

On May 1, 2007, at 4:29 AM, Joe Schaefer wrote:
> Although I haven't looked at the codebase yet,
> this looks to me like it will be a fun project.
> I'd be happy to help out with some of the stuff
> I'm more familiar with, namely integrating Apache-Test
> and/or using libapreq2 for parsing inbound POST data.

Cool!

I've already (guided by Garrett) tried to start doing some
Apache-Test usage. I don't really know much about libapreq2 at all.

> Have you considered the idea of using libapreq2's parsing
> infrastructure for handling POST?  If not, I'd be happy
> to try and pitch the advantages of doing so, which
> hopefully outweigh the problem of introducing a
> new dependency.

Please do! As far as dependencies go I'm not too fussed really,
since it seems we already depend on the stuff libapreq2 uses.

cheers,

Leo