Posted to c-dev@axis.apache.org by Supun Kamburugamuva <su...@gmail.com> on 2007/05/31 07:36:46 UTC

Building Axiom/C using axutil_string

Hi,

At the moment Axiom has the interface required to build using
axutil_string instead of char *. AFAIK there is a problem with this
approach.

When an XML document reaches the parser, the parser sometimes does not
know its total length in advance, for example when the XML is in a
file. So the parser starts by allocating a buffer of a predefined
size, and if that buffer turns out to be too small while reading the
XML, it reallocates the buffer.

In Musila it is really easy to insert a '\0' character at the end of
every token and return it as a char string. But if the parser
reallocates the buffer while parsing the XML file, these strings
become invalid (they point to garbage).

In Musila I have solved this problem by not keeping a char pointer
into the buffer in tokens. Instead, the tokens in Musila use an
integer index into the buffer. When I want to operate on a token, I
combine the current pointer to the buffer with the token's index.
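
The index-based tokens described above can be sketched roughly as
follows. This is a minimal illustration, not Musila's actual code: the
names xbuf_t, xtoken_t, xbuf_append, xtoken_ptr, xtoken_equals and
xtoken_demo are all hypothetical. It shows that a token holding only an
offset and a length stays valid even when realloc moves the buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types; not the Musila or Axiom/C API. */
typedef struct {
    char  *data;  /* base pointer; may change when the buffer grows */
    size_t len;   /* bytes in use */
    size_t cap;   /* bytes allocated */
} xbuf_t;

typedef struct {
    size_t start; /* offset of the token's first byte in the buffer */
    size_t len;   /* token length in bytes */
} xtoken_t;

/* Append n bytes, growing the buffer with realloc when needed.
   Returns 0 on success, -1 on allocation failure. */
static int xbuf_append(xbuf_t *b, const char *src, size_t n)
{
    if (b->len + n > b->cap) {
        size_t cap = b->cap ? b->cap : 16;
        while (cap < b->len + n)
            cap *= 2;
        char *p = realloc(b->data, cap);
        if (!p)
            return -1;
        b->data = p;
        b->cap = cap;
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
    return 0;
}

/* Resolve a token against the buffer's current base pointer. */
static const char *xtoken_ptr(const xbuf_t *b, xtoken_t t)
{
    return b->data + t.start;
}

/* Compare a token with a NUL-terminated string. */
static int xtoken_equals(const xbuf_t *b, xtoken_t t, const char *s)
{
    return strlen(s) == t.len && memcmp(xtoken_ptr(b, t), s, t.len) == 0;
}

/* Usage: the token survives a buffer growth that may move the data. */
void xtoken_demo(void)
{
    xbuf_t b = {0};
    xtoken_t name = {1, 4};            /* "root" inside "<root>" */
    char filler[200];

    memset(filler, ' ', sizeof filler);
    assert(xbuf_append(&b, "<root>", 6) == 0);
    assert(xbuf_append(&b, filler, sizeof filler) == 0); /* forces growth */
    assert(xtoken_equals(&b, name, "root"));
    free(b.data);
}
```

A pointer obtained from xtoken_ptr should be treated as stale after any
call that may grow the buffer, and resolved again from the token.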

Regards,
Supun.

---------------------------------------------------------------------
To unsubscribe, e-mail: axis-c-dev-unsubscribe@ws.apache.org
For additional commands, e-mail: axis-c-dev-help@ws.apache.org

Re: Building Axiom/C using axutil_string

Posted by Supun Kamburugamuva <su...@gmail.com>.

> > When an XML document reaches the parser, the parser sometimes does not
> > know its total length in advance, for example when the XML is in a
> > file. So the parser starts by allocating a buffer of a predefined
> > size, and if that buffer turns out to be too small while reading the
> > XML, it reallocates the buffer.
>
> Rather than reallocating the buffer, you can use an array of buffers.
> This way you would not lose the originally parsed buffer, and if it
> does not have enough room, you allocate a new buffer and read the rest
> of the XML into this new buffer.
>
> Of course you need to put some effort into managing the array of
> buffers properly, but I think you can do that with a very simple
> algorithm.
>

As suggested above, the multiple-buffer solution seems to be the best.
I also think it will perform better than reallocating the buffer, so I
will implement that.

Regards,
Supun.


Re: Building Axiom/C using axutil_string

Posted by Samisa Abeysinghe <sa...@wso2.com>.

Supun Kamburugamuva wrote:

> Hi,
>
> At the moment Axiom has the interface required to build using
> axutil_string instead of char *. AFAIK there is a problem with this
> approach.

I think the string approach is cleaner than using char* for various
reasons, especially to minimize memory duplication.

>
> When an XML document reaches the parser, the parser sometimes does not
> know its total length in advance, for example when the XML is in a
> file. So the parser starts by allocating a buffer of a predefined
> size, and if that buffer turns out to be too small while reading the
> XML, it reallocates the buffer.

Rather than reallocating the buffer, you can use an array of buffers.
This way you would not lose the originally parsed buffer, and if it
does not have enough room, you allocate a new buffer and read the rest
of the XML into this new buffer.

Of course you need to put some effort into managing the array of
buffers properly, but I think you can do that with a very simple
algorithm.
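
A minimal sketch of the array-of-buffers idea, under stated assumptions.
The names chunk_pool_t, pool_grow, pool_store, pool_free, pool_demo and
the CHUNK_SIZE value are hypothetical and not part of Axiom/C. To keep a
token from straddling two buffers, this sketch places each token wholly
inside one buffer and starts a new buffer when the current one is too
full; that detail is an assumption, not something specified above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK_SIZE 64 /* illustrative; a real parser would use far more */

/* Hypothetical pool of fixed-size buffers. */
typedef struct {
    char **chunks; /* array of buffer pointers; the buffers never move */
    size_t count;  /* number of buffers allocated */
    size_t used;   /* bytes used in the last buffer */
} chunk_pool_t;

/* Add one buffer. Only the pointer array is reallocated, so strings
   already handed out stay where they are. */
static int pool_grow(chunk_pool_t *p)
{
    char **arr = realloc(p->chunks, (p->count + 1) * sizeof *arr);
    if (!arr)
        return -1;
    p->chunks = arr;

    char *c = malloc(CHUNK_SIZE);
    if (!c)
        return -1;
    p->chunks[p->count++] = c;
    p->used = 0;
    return 0;
}

/* Store an n-byte token followed by '\0'. Returns a pointer that
   remains valid until pool_free, or NULL on failure. */
static char *pool_store(chunk_pool_t *p, const char *tok, size_t n)
{
    if (n + 1 > CHUNK_SIZE)
        return NULL; /* oversize tokens are not handled in this sketch */
    if (p->count == 0 || p->used + n + 1 > CHUNK_SIZE) {
        if (pool_grow(p) != 0)
            return NULL;
    }
    char *dst = p->chunks[p->count - 1] + p->used;
    memcpy(dst, tok, n);
    dst[n] = '\0';
    p->used += n + 1;
    return dst;
}

/* Release every buffer and the pointer array. */
static void pool_free(chunk_pool_t *p)
{
    for (size_t i = 0; i < p->count; i++)
        free(p->chunks[i]);
    free(p->chunks);
    p->chunks = NULL;
    p->count = 0;
    p->used = 0;
}

/* Usage: the first string is still intact after more buffers are added. */
void pool_demo(void)
{
    chunk_pool_t p = {0};
    char *first = pool_store(&p, "root", 4);

    assert(first != NULL);
    for (int i = 0; i < 100; i++)
        assert(pool_store(&p, "element", 7) != NULL);
    assert(p.count > 1);
    assert(strcmp(first, "root") == 0);
    pool_free(&p);
}
```

Because existing buffers are never reallocated, the NUL-terminated
strings returned earlier can keep being used as plain char strings.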

>
> In Musila it is really easy to insert a '\0' character at the end of
> every token and return it as a char string. But if the parser
> reallocates the buffer while parsing the XML file, these strings
> become invalid (they point to garbage).
>
> In Musila I have solved this problem by not keeping a char pointer
> into the buffer in tokens. Instead, the tokens in Musila use an
> integer index into the buffer. When I want to operate on a token, I
> combine the current pointer to the buffer with the token's index.

The current AXIOM/C implementation cannot live with this index approach.
If we were to do that, the amount of change required would be too much;
maybe in Axis2 2.0, but not in the 1.x family. There is an issue raised
by James in Jira to use buffer plus length for strings, but we did not
do that for 1.0 because of the amount of change required.
However, if you consider the multi-buffer approach, you can live with
the current string API.

Samisa...

>
> Regards,
> Supun.
>
