Posted to dev@commons.apache.org by Scott Sanders <ss...@nextance.com> on 2002/01/04 02:37:29 UTC

[Digester] Supporting mixed content

Hi all,

I am trying to use Digester in a situation where some elements have
mixed content.  For example: 

<tag name="text">Beginning effective <tag
name="variableRef">EffectiveDate</tag>, the new policy will be in
effect.</tag>

Currently, Digester just appends the text from each characters() call to
the body text and then uses it in endElement().  The problem is that I
need "Beginning effective " and ", the new policy..." to arrive as two
separate calls to some method, like addNode().  I know it sounds a little
like DOM, but that is what I need at the moment.
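
To make the requirement concrete, here is a minimal sketch using plain SAX
rather than Digester. It assumes nothing about Digester internals: the class
MixedContentSegments and its segments() method are hypothetical names, and
the out.add() call merely stands in for something like addNode(). Text is
flushed at every element boundary, so each run of character data is reported
on its own instead of being merged into one body string.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration; this is plain SAX, not part of Digester.
final class MixedContentSegments {

    /** Returns each run of text as its own entry, split at element boundaries. */
    static List<String> segments(String xml) {
        List<String> out = new ArrayList<>();
        StringBuilder pending = new StringBuilder();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // SAX may deliver one run of text in several chunks, so buffer it.
                pending.append(ch, start, length);
            }

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                flush();
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                flush();
            }

            private void flush() {
                if (pending.length() > 0) {
                    out.add(pending.toString()); // stand-in for a call such as addNode()
                    pending.setLength(0);
                }
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
        return out;
    }

    public static void main(String[] args) {
        String xml = "<tag name=\"text\">Beginning effective <tag name=\"variableRef\">"
                + "EffectiveDate</tag>, the new policy will be in effect.</tag>";
        segments(xml).forEach(s -> System.out.println("[" + s + "]"));
    }
}
```

For the sample element this reports three separate runs: the text before the
inner tag, the inner tag's own text, and the text after it.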

So, how do I accomplish this?  I propose changing the characters()
method to call each rule's bodyText() method.  To prevent breakage, the
default Rule class would then need to append these pieces together and
use the result.

To access this functionality, a match rule would be something like
"tag/text()" (a la XPath).

Any other ideas?  I do not want to break backward compatibility (or do
we want to break compatibility, support more, and then release as 2.0?),
but I think that this is necessary to expand the flexibility of Digester.
I will soon need to expose attributes, as I am working on a more complete
XPath implementation.  That way we can also do something like
"tag/@name" (the name attribute of the tag element).

As far as I can see, Digester is being used for simple configuration,
but documents that contain mixed content are effectively impossible to
handle right now, and I really do not want to have to use DOM to do this.
FYI, as a random thought, Digester could not actually be used to create
a DOM right now, because of this body text issue.
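
For reference, the two patterns mentioned above ("tag/text()" and
"tag/@name") can be tried against the sample with the XPath engine that
ships in the JDK (javax.xml.xpath). This is only a sketch of what the
expressions select under standard XPath semantics. It builds a DOM
internally, so it is not a proposal for how Digester itself would match
them. XPathPatternDemo and select() are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration of standard XPath semantics; not Digester code.
final class XPathPatternDemo {

    /** Evaluates the expression against the document and returns each node's value. */
    static List<String> select(String xml, String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, new InputSource(new StringReader(xml)),
                            XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                // For text and attribute nodes, getNodeValue() is the string content.
                values.add(nodes.item(i).getNodeValue());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<tag name=\"text\">Beginning effective <tag name=\"variableRef\">"
                + "EffectiveDate</tag>, the new policy will be in effect.</tag>";
        System.out.println(select(xml, "tag/text()")); // text children of the outer tag
        System.out.println(select(xml, "tag/@name"));  // name attribute of the outer tag
    }
}
```

Under these semantics "tag/text()" selects the two text nodes on either side
of the inner element, which is exactly the split that the body-text
accumulation currently hides.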

Scott


--
To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
For additional commands, e-mail: <ma...@jakarta.apache.org>


Re: [Digester] Supporting mixed content

Posted by "Craig R. McClanahan" <cr...@apache.org>.
On Sat, 5 Jan 2002, robert burrell donkin wrote:

> Date: Sat, 5 Jan 2002 13:50:57 +0000
> From: robert burrell donkin <ro...@mac.com>
> Reply-To: Jakarta Commons Developers List <co...@jakarta.apache.org>
> To: Jakarta Commons Developers List <co...@jakarta.apache.org>
> Subject: Re: [Digester] Supporting mixed content
>
> hi scott
>
> i'm always pretty reluctant about changes to the basic Rule interface.
>

I share this reluctance.  The kinds of documents that Digester does best
at tend to be either all-attributes or all-body-content, and I don't want
to lose the simplicity that's present for those kinds of cases.

In general, I think of Digester as a nice wrapper around *SAX* parsing,
not around DOM manipulation, and it is best used when you are using
pattern matching techniques to grab out what you want and ignoring the
rest.  The use case Scott describes might be easier to deal with using
something like DOM4J or JDOM.

> i think that there might be some other ways around this problem (but i'll
> need some thinking time) so maybe it'd be a good idea to hold off changing
> the interface for a little while.
>
> - robert
>

Craig




Re: [Digester] Supporting mixed content

Posted by robert burrell donkin <ro...@mac.com>.
hi scott

i'm always pretty reluctant about changes to the basic Rule interface.

i think that there might be some other ways around this problem (but i'll 
need some thinking time) so maybe it'd be a good idea to hold off changing 
the interface for a little while.

- robert




Re: [Digester] Supporting mixed content

Posted by robert burrell donkin <ro...@mac.com>.
On Friday, January 4, 2002, at 01:37 AM, Scott Sanders wrote:

> Any other ideas?  I do not want to break backward compatibility (or do
> we want to break compatibility, support more, and then release as 2.0?),
> but I think that this is necessary to expand the flexibility of Digester.
> I will soon need to expose attributes, as I am working on a more complete
> XPath implementation.  That way we can also do something like
> "tag/@name" (the name attribute of the tag element).

XPath pattern matching rules would be really cool :)

one way that - in the past - we've got round needing more information 
without breaking the Rule interface is to expose more information as 
properties on digester. for example, getCurrentElementName() allows rules 
to automagically map on the basis of the current element name. i know that 
this smells like bad design but it does allow the Rule interface to be 
preserved. later, we might feel ready to break the interface, but maybe 
not just yet.
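
A minimal sketch of that idea, under loud assumptions: MiniDigester and
BodyRule below are toy stand-ins invented for illustration and are NOT the
real Digester classes. The point is only the shape of the design: the rule
callback keeps its single body-text parameter, and anything extra (here the
current element name) is read from a getter on the parser object.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch; MiniDigester and BodyRule are NOT the real Digester API.
final class PropertyExposureSketch {

    /** Rule contract that stays fixed: it only ever receives the body text. */
    interface BodyRule {
        void body(String text);
    }

    /** Toy stand-in for Digester that exposes parse state through a getter. */
    static final class MiniDigester extends DefaultHandler {
        private final Deque<String> names = new ArrayDeque<>();
        private final Deque<StringBuilder> bodies = new ArrayDeque<>();
        private BodyRule rule;

        void setRule(BodyRule rule) {
            this.rule = rule;
        }

        /** Extra information offered as a property instead of a new rule parameter. */
        String getCurrentElementName() {
            return names.peek();
        }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            names.push(qName);
            bodies.push(new StringBuilder());
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            bodies.peek().append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            // The element is still "current" while its rule fires.
            rule.body(bodies.pop().toString().trim());
            names.pop();
        }
    }

    /** Maps each non-empty element body to the name of the element it came from. */
    static Map<String, String> parse(String xml) {
        Map<String, String> values = new LinkedHashMap<>();
        MiniDigester digester = new MiniDigester();
        digester.setRule(text -> {
            if (!text.isEmpty()) {
                values.put(digester.getCurrentElementName(), text);
            }
        });
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), digester);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(parse("<config><host>example</host><port>8080</port></config>"));
    }
}
```

The trade-off is the one noted above: the rule depends on hidden parser state
rather than on its arguments, but existing rules keep compiling unchanged.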

- robert




Re: [Digester] Supporting mixed content

Posted by robert burrell donkin <ro...@mac.com>.
On Friday, January 4, 2002, at 01:37 AM, Scott Sanders wrote:

> Hi all,
>
> I am trying to use Digester in a situation where some elements have
> mixed content.  For example:
>
> <tag name="text">Beginning effective <tag
> name="variableRef">EffectiveDate</tag>, the new policy will be in
> effect.</tag>
>
> Currently, Digester just appends the text from each characters() call to
> the body text and then uses it in endElement().  The problem is that I
> need "Beginning effective " and ", the new policy..." to arrive as two
> separate calls to some method, like addNode().  I know it sounds a little
> like DOM, but that is what I need at the moment.
>
> So, how do I accomplish this?  I propose changing the characters()
> method to call each rule's bodyText() method.  To prevent breakage, the
> default Rule class would then need to append these pieces together and
> use the result.

one possible approach might be to use a list (or something) of
StringBuffers. the idea is that you create a new StringBuffer when a new
piece of mixed content begins.

to use your example above, digester would begin by creating a StringBuffer.
'Beginning effective ' would be appended. a list containing the
StringBuffer would then be pushed onto the body text stack. it'll be
popped off when the inner tag element is finished. digester would then
create a new StringBuffer and add it to the end of the list. ', the new
policy will be in effect.' would then be appended to the last StringBuffer
in the list. when the outer tag element is finished, digester concatenates
the contents of the buffers and calls bodyText on the rules with the
result. the list of StringBuffers could be presented as a property.

it's a little convoluted but maybe this would do what you want without 
having to change the Rule interface.
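
a rough sketch of the buffer-list idea, written against plain SAX rather than
the real Digester. everything named here (SegmentedBodyText, bodyTexts,
segmentLists) is hypothetical, it uses StringBuilder where the description
says StringBuffer, and it simplifies the bookkeeping slightly: each open
element gets its own list of buffers on a stack, and a fresh buffer is added
to the parent's list whenever a child element ends.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch of the list-of-buffers idea; not the real Digester.
final class SegmentedBodyText extends DefaultHandler {

    // Body text stack: one list of buffers per currently open element.
    private final Deque<List<StringBuilder>> stack = new ArrayDeque<>();

    // What an unchanged rule would receive: one concatenated string per element.
    final List<String> bodyTexts = new ArrayList<>();

    // The extra "property": the separate text pieces of each element.
    final List<List<String>> segmentLists = new ArrayList<>();

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        List<StringBuilder> buffers = new ArrayList<>();
        buffers.add(new StringBuilder());
        stack.push(buffers);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        List<StringBuilder> top = stack.peek();
        if (top != null) {
            top.get(top.size() - 1).append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        List<StringBuilder> finished = stack.pop();
        List<String> segments = new ArrayList<>();
        StringBuilder joined = new StringBuilder();
        for (StringBuilder buffer : finished) {
            if (buffer.length() > 0) {
                segments.add(buffer.toString());
            }
            joined.append(buffer);
        }
        bodyTexts.add(joined.toString()); // backward-compatible concatenated body
        segmentLists.add(segments);       // the pieces, kept apart
        if (!stack.isEmpty()) {
            // Text after this child belongs to a new piece of the parent's content.
            stack.peek().add(new StringBuilder());
        }
    }

    static SegmentedBodyText parse(String xml) {
        SegmentedBodyText handler = new SegmentedBodyText();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample XML", e);
        }
        return handler;
    }

    public static void main(String[] args) {
        SegmentedBodyText result = parse("<tag name=\"text\">Beginning effective "
                + "<tag name=\"variableRef\">EffectiveDate</tag>"
                + ", the new policy will be in effect.</tag>");
        System.out.println(result.bodyTexts);
        System.out.println(result.segmentLists);
    }
}
```

elements finish innermost first, so the inner tag's entry comes before the
outer tag's entry in both lists.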

- robert

