Posted to user@commons.apache.org by Simon Kitching <si...@ecnetwork.co.nz> on 2004/05/10 01:21:56 UTC

Re: [digester] digester adds sub-objects before fully created

Hi,

On Mon, 2004-05-10 at 01:05, gary and sarah wrote:
> I am using digester to read the following file
> 
> with the following as an example of a rules
> 
>         // x axis
>         digester.addObjectCreate("tensor_frame/x_axis", BasicProjections.class);
> 
>         digester.addCallMethod("tensor_frame/x_axis", "setAxis", 1,
>                 new Class[] { Axis.class });
>         digester.addObjectParam("tensor_frame/x_axis", 0, Axis.X);
> 
>         digester.addBeanPropertySetter ("tensor_frame/x_axis/x", "x");
>         digester.addBeanPropertySetter ("tensor_frame/x_axis/y", "y");
>         digester.addBeanPropertySetter ("tensor_frame/x_axis/z", "z");
>         digester.addSetNext("tensor_frame/x_axis", "addProjections");
> 
>         // tensor_frame
>         digester.addObjectCreate("tensor_frame", BasicTensorFrame.class);
> 
> however, BasicTensorFrame.addProjections is being called with an 
> incompletely constructed BasicProjections object...
> 
> this appears to be because the end methods of rules are being called in 
> the reverse order from that in which they are declared:
> 

It's more a side-effect of the stack-oriented nature of Digester. Yes,
it's counter-intuitive.

The reason that the CallMethodRule does the actual method invocation
from the end method is that you need to be sure that all the param rules
have fired. In particular, param rules that pass the element body text
need to have completed their work.

And end methods must be invoked in the reverse order relative to begin
methods to ensure that rules which manipulate the digester object stack
and param stack clean up correctly. Begin methods commonly push onto the
stack, and end methods clean up the stack; failing to fire end in
reverse order will severely stuff up the stack(s)!

Workaround: add the SetNextRule before the CallMethodRule. Its end
method will therefore be called after the CallMethodRule's end method,
and so the object will be initialised before addProjections is invoked.
It looks a little odd, but is a 100% reliable solution.
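The begin/end ordering Simon describes can be sketched without Digester itself. The snippet below is a plain-Java illustration (the `Rule` interface and names here are invented stand-ins, not the real Digester API) of why registering the SetNextRule first makes its end fire last:

```java
import java.util.ArrayList;
import java.util.List;

public class FiringOrderDemo {
    // Minimal stand-in for a Digester rule; not the real API.
    public interface Rule { String name(); }

    public static List<String> fire(List<Rule> rules) {
        List<String> calls = new ArrayList<>();
        // begin methods fire in the order the rules were added...
        for (Rule r : rules) calls.add("begin:" + r.name());
        // ...but end methods fire in reverse, so stack push/pop pairs nest.
        for (int i = rules.size() - 1; i >= 0; i--) {
            calls.add("end:" + rules.get(i).name());
        }
        return calls;
    }

    public static void main(String[] args) {
        // Registering SetNext BEFORE CallMethod means its end fires LAST,
        // i.e. after the CallMethodRule has finished initialising the object.
        List<Rule> rules = List.of(
                () -> "SetNext",
                () -> "CallMethod");
        System.out.println(fire(rules));
        // prints "[begin:SetNext, begin:CallMethod, end:CallMethod, end:SetNext]"
    }
}
```

The nesting is the same reason matched push/pop pairs on the object stack stay balanced.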

The javadocs for the next release will have a big note pointing out this
behaviour, and the appropriate workaround. In the CVS repository there
is also an experimental alternative to CallMethodRule which doesn't
suffer from this problem, but it won't make it into the next release.

Regards,

Simon


---------------------------------------------------------------------
To unsubscribe, e-mail: commons-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-user-help@jakarta.apache.org


[Digester] cousin Beck

Posted by John Kristian <jk...@engineer.com>.
If you need an XML-to-Java mapper that's more programmable than Digester,
or a Java-to-XML mapper designed on similar lines, check out Beck
<http://beck.sourceforge.net/>.




Re: Digester trimming whitespaces

Posted by Adam Pelletier <ad...@moebius-solutions.com>.
Fixed it with an HttpServletRequestWrapper subclass that wraps the request 
and the DiskFileUpload and makes them work together.  Thanks.

----- Original Message ----- 
From: "robert burrell donkin" <ro...@blueyonder.co.uk>
To: "Jakarta Commons Users List" <co...@jakarta.apache.org>
Sent: Thursday, October 07, 2004 1:59 PM
Subject: Re: Digester trimming whitespaces


>
> On 3 Oct 2004, at 22:51, Simon Kitching wrote:
>
>> On Mon, 2004-10-04 at 11:33, robert burrell donkin wrote:
>>>>
>>>> I would recommend that you take a copy of the source of whatever rule
>>>> is
>>>> causing you problems and rename the class (including changing the
>>>> package declaration to something in your namespace), then delete the
>>>> trim() call.
>>>
>>> i'm not sure whether this would do it.
>>>
>>> i suspect that what would be needed would be for an additional flag to
>>> be added to digester that would pass on all calls to
>>> ignorableWhitespace to characters. depending on the parser used, some
>>> configuration may be necessary to ensure that the whitespace is passed
>>> on to digester.
>>
>> My understanding of "ignorable whitespace" is that when there is no DTD
>> or schema, whitespace is never ignorable; any text within an element is
>> reported via the "characters" callback. When there is a DTD or schema
>> present, and it indicates that a particular element has "element content
>> only" then any whitespace found in the element is reported as "ignorable
>> whitespace" instead of being reported via the "characters" method.
>>
>> So as far as I can see, this is not relevant to Digester. If a document
>> has a schema/DTD and that DTD specifies that element <foo> is not
>> supposed to have any text within it (just child elements) then we really
>> don't care about whether there is whitespace present or not.
>
> +1
>
> i've had a poke around and i reckon that you're probably right on this one.
>
>> I think it might be possible for the Digester class itself to trim or
>> not trim the body text, instead of the individual rules doing it. But
>> that would then force the same "to trim or not to trim" setting to be
>> present for every rule, making it impossible (for example) to allow
>> whitespace in text within the <description> element but to ignore it
>> inside the <location-code> element.
>
> +1
>
> on reflection, i'd probably support adding a property (to allow trimming or 
> not) or (alternatively) a post-processing hook for a subclass to those rules 
> that trim.
>
> - robert
>





Re: Digester trimming whitespaces

Posted by robert burrell donkin <ro...@blueyonder.co.uk>.
On 3 Oct 2004, at 22:51, Simon Kitching wrote:

> On Mon, 2004-10-04 at 11:33, robert burrell donkin wrote:
>>>
>>> I would recommend that you take a copy of the source of whatever rule
>>> is
>>> causing you problems and rename the class (including changing the
>>> package declaration to something in your namespace), then delete the
>>> trim() call.
>>
>> i'm not sure whether this would do it.
>>
>> i suspect that what would be needed would be for an additional flag to
>> be added to digester that would pass on all calls to
>> ignorableWhitespace to characters. depending on the parser used, some
>> configuration may be necessary to ensure that the whitespace is passed
>> on to digester.
>
> My understanding of "ignorable whitespace" is that when there is no DTD
> or schema, whitespace is never ignorable; any text within an element is
> reported via the "characters" callback. When there is a DTD or schema
> present, and it indicates that a particular element has "element 
> content
> only" then any whitespace found in the element is reported as 
> "ignorable
> whitespace" instead of being reported via the "characters" method.
>
> So as far as I can see, this is not relevant to Digester. If a document
> has a schema/DTD and that DTD specifies that element <foo> is not
> supposed to have any text within it (just child elements) then we 
> really
> don't care about whether there is whitespace present or not.

+1

i've had a poke around and i reckon that you're probably right on this 
one.

> I think it might be possible for the Digester class itself to trim or
> not trim the body text, instead of the individual rules doing it. But
> that would then force the same "to trim or not to trim" setting to be
> present for every rule, making it impossible (for example) to allow
> whitespace in text within the <description> element but to ignore it
> inside the <location-code> element.

+1

on reflection, i'd probably support adding a property (to allow trimming 
or not) or (alternatively) a post-processing hook for a subclass to those 
rules that trim.

- robert




Building feedParser with Maven..

Posted by Marco Mistroni <mm...@waersystems.com>.
Hello all,
	 I have downloaded the sources of feedparser and I am trying to
build it using Maven.
I am getting the following error during compilation..

cannot resolve symbol
symbol  : class XPath
location: class org.apache.commons.feedparser.BaseParser
        XPath xpath = new XPath( query );


From which jar file is the XPath class supposed to come?

I am using project.xml downloaded from 

http://svn.apache.org/repos/asf/jakarta/commons/proper/feedparser/trunk


thanx in advance and regards
	marco






Re: Digester trimming whitespaces

Posted by Simon Kitching <si...@ecnetwork.co.nz>.
On Mon, 2004-10-04 at 11:33, robert burrell donkin wrote:
> >
> > I would recommend that you take a copy of the source of whatever rule 
> > is
> > causing you problems and rename the class (including changing the
> > package declaration to something in your namespace), then delete the
> > trim() call.
> 
> i'm not sure whether this would do it.
> 
> i suspect that what would be needed would be for an additional flag to 
> be added to digester that would pass on all calls to 
> ignorableWhitespace to characters. depending on the parser used, some 
> configuration may be necessary to ensure that the whitespace is passed 
> on to digester.

My understanding of "ignorable whitespace" is that when there is no DTD
or schema, whitespace is never ignorable; any text within an element is
reported via the "characters" callback. When there is a DTD or schema
present, and it indicates that a particular element has "element content
only" then any whitespace found in the element is reported as "ignorable
whitespace" instead of being reported via the "characters" method.

So as far as I can see, this is not relevant to Digester. If a document
has a schema/DTD and that DTD specifies that element <foo> is not
supposed to have any text within it (just child elements) then we really
don't care about whether there is whitespace present or not.

I think it might be possible for the Digester class itself to trim or
not trim the body text, instead of the individual rules doing it. But
that would then force the same "to trim or not to trim" setting to be
present for every rule, making it impossible (for example) to allow
whitespace in text within the <description> element but to ignore it
inside the <location-code> element.
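This contract can be checked with the JDK's own SAX parser: with no DTD or schema, all element text (including a leading blank) arrives via the characters() callback, and ignorableWhitespace() never fires. A self-contained sketch using only standard JDK classes:

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class WhitespaceDemo {
    // Returns { text seen by characters(), text seen by ignorableWhitespace() }.
    public static String[] classify(String xml) throws Exception {
        StringBuilder chars = new StringBuilder();
        StringBuilder ign = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override public void characters(char[] ch, int start, int len) {
                chars.append(ch, start, len);
            }
            @Override public void ignorableWhitespace(char[] ch, int start, int len) {
                ign.append(ch, start, len);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return new String[] { chars.toString(), ign.toString() };
    }

    public static void main(String[] args) throws Exception {
        // No DTD, so the parser cannot know any whitespace is "ignorable":
        // everything, including the leading blank, is reported as characters.
        String[] r = classify("<content> some csv data</content>");
        System.out.println("characters: '" + r[0] + "', ignorable: '" + r[1] + "'");
    }
}
```

With a DTD declaring element-only content for the parent element (and validation enabled), the same whitespace would instead be routed to ignorableWhitespace().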

Regards,

Simon 




Re: Digester trimming whitespaces

Posted by robert burrell donkin <ro...@blueyonder.co.uk>.
On 3 Oct 2004, at 22:12, Simon Kitching wrote:

> On Fri, 2004-10-01 at 20:40, Marco Mistroni wrote:
>> Hello all,
>> 	I am currently having a problem (?) with digester, in the
>> sense that when parsing XML it is 'trimming' whitespace..
>
> Hi Marco,
>
> Yes, some Digester rules do this deliberately. If you look at the 
> source
> for CallParamRule, CallMethodRule, etc. you will see something like:
>   bodyText = bodyText.trim();
>
> As this code precedes my involvement in Digester, I can't say exactly
> what the motivation was for doing this, but presume there was a good
> reason.

craig or scott would be needed to give a definitive answer to this one.

my guess is that since the handling of whitespace by parsers has been 
variable, the best way to achieve consistency is to lose all 
whitespace.

> I certainly have been using digester fairly heavily and not
> needed to allow leading/trailing whitespace in element bodies. However 
> I
> can understand that some people might need to.
>
> I would recommend that you take a copy of the source of whatever rule 
> is
> causing you problems and rename the class (including changing the
> package declaration to something in your namespace), then delete the
> trim() call.

i'm not sure whether this would do it.

i suspect that what would be needed would be for an additional flag to 
be added to digester that would pass on all calls to 
ignorableWhitespace to characters. depending on the parser used, some 
configuration may be necessary to ensure that the whitespace is passed 
on to digester.

in terms of the rules, it would probably be neater and quicker to move 
the trim call out of the rule (where it may be called multiple times) 
and into digester. when digester was set to ignore whitespace, 
whitespace would be trimmed. when the setting was to record whitespace, 
ignorableWhitespace would pass the whitespace on to characters and the 
output wouldn't be trimmed.

should be quite an easy change to make but ensuring that recording 
whitespace worked might prove more tricky...

- robert




Re: Digester trimming whitespaces

Posted by Craig McClanahan <cr...@gmail.com>.
On Mon, 04 Oct 2004 10:12:36 +1300, Simon Kitching
<si...@ecnetwork.co.nz> wrote:
> On Fri, 2004-10-01 at 20:40, Marco Mistroni wrote:
> > Hello all,
> >       I am currently having a problem (?) with digester, in the
> > sense that when parsing XML it is 'trimming' whitespace..
> 
> Hi Marco,
> 
> Yes, some Digester rules do this deliberately. If you look at the source
> for CallParamRule, CallMethodRule, etc. you will see something like:
>   bodyText = bodyText.trim();
> 
> As this code precedes my involvement in Digester, I can't say exactly
> what the motivation was for doing this, but presume there was a good
> reason. I certainly have been using digester fairly heavily and not
> needed to allow leading/trailing whitespace in element bodies. However I
> can understand that some people might need to.
> 

Here's a real simple use case ... parsing web.xml files in Tomcat. 
Regardless of the technical niceties of how XML parsers actually work,
users expect something like:

  <servlet-class>com.mypackage.MyServlet</servlet-class>

and

  <servlet-class>   com.mypackage.MyServlet   </servlet-class>

and

  <servlet-class>
    com.mypackage.MyServlet
  </servlet-class>

to have the same semantic effect.  That is accomplished by trimming
whitespace off the body content before using its contents. 
Consistency (the "principle of least surprise") will then encourage us
to do the same thing anywhere else the body content is processed, so
we did.
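The equivalence Craig describes is exactly what String.trim() delivers; the three web.xml forms collapse to one value:

```java
public class TrimDemo {
    public static void main(String[] args) {
        String canonical = "com.mypackage.MyServlet";
        // The padded and multi-line forms from the web.xml example
        // become identical to the canonical form once trimmed.
        String padded = "   com.mypackage.MyServlet   ".trim();
        String multiline = "\n    com.mypackage.MyServlet\n  ".trim();
        System.out.println(canonical.equals(padded) && canonical.equals(multiline));
        // prints "true"
    }
}
```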

> I would recommend that you take a copy of the source of whatever rule is
> causing you problems and rename the class (including changing the
> package declaration to something in your namespace), then delete the
> trim() call.
> 
> If you feel like contributing a patch to add some kind of boolean flag
> to the original Rule class to allow people to enable/disable trimming of
> whitespace (including unit tests) then I think there would be some
> interest in applying this to Digester - I certainly think that would be
> useful.

That would indeed be useful for some scenarios for using Digester
other than parsing configuration files.  However, even there I suspect
it's going to be quite common for the producer of XML content to be
able to add newlines and indentations (for readability) of body
content, so the need to avoid trimming is certainly not going to be
universal.

> 
> Regards,
> 
> Simon
> 

Craig



Re: Digester trimming whitespaces

Posted by Simon Kitching <si...@ecnetwork.co.nz>.
On Fri, 2004-10-01 at 20:40, Marco Mistroni wrote:
> Hello all,
> 	I am currently having a problem (?) with digester, in the
> sense that when parsing XML it is 'trimming' whitespace..

Hi Marco,

Yes, some Digester rules do this deliberately. If you look at the source
for CallParamRule, CallMethodRule, etc. you will see something like:
  bodyText = bodyText.trim();

As this code precedes my involvement in Digester, I can't say exactly
what the motivation was for doing this, but presume there was a good
reason. I certainly have been using digester fairly heavily and not
needed to allow leading/trailing whitespace in element bodies. However I
can understand that some people might need to.

I would recommend that you take a copy of the source of whatever rule is
causing you problems and rename the class (including changing the
package declaration to something in your namespace), then delete the
trim() call.

If you feel like contributing a patch to add some kind of boolean flag
to the original Rule class to allow people to enable/disable trimming of
whitespace (including unit tests) then I think there would be some
interest in applying this to Digester - I certainly think that would be
useful.
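A minimal, Digester-independent sketch of the boolean-flag idea (the class and method names below are invented for illustration, not the real Digester API): each rule carries its own flag, so trimming can differ per element pattern, and the flag is applied exactly once when the body text is recorded.

```java
public class TrimFlagSketch {
    // Hypothetical base-rule fragment modelling the proposed flag.
    public static class BodyTextRule {
        private boolean trimBodyText = true;   // default matches current behaviour
        private String bodyText;

        public void setTrimBodyText(boolean trim) { this.trimBodyText = trim; }

        // Called with the raw element body; applies the flag exactly once.
        public void body(String text) {
            this.bodyText = trimBodyText ? text.trim() : text;
        }

        public String getBodyText() { return bodyText; }
    }

    public static void main(String[] args) {
        BodyTextRule trimming = new BodyTextRule();
        trimming.body(" some csv data");

        BodyTextRule preserving = new BodyTextRule();
        preserving.setTrimBodyText(false);
        preserving.body(" some csv data");

        System.out.println("'" + trimming.getBodyText() + "' vs '"
                + preserving.getBodyText() + "'");
        // prints "'some csv data' vs ' some csv data'"
    }
}
```

Defaulting the flag to true would keep existing Digester behaviour unchanged while letting users like Marco opt out per rule.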

Regards,

Simon





Re: Digester trimming whitespaces

Posted by Reid Pinchback <re...@yahoo.com>.
Not 100% certain about this, but just did a quick pass
through the Digester source.  There is a
"ignorableWhitespace" callback in the Digester class
that currently doesn't do anything.  Maybe it needs to
be checking the configuration of the parser to decide
if it should be tacking on the whitespace to the
bodyText of the current element?


--- Marco Mistroni <mm...@waersystems.com> wrote:

> Hello all,
> 	I am currently having a problem (?) with digester, in the
> sense that when parsing XML it is 'trimming' whitespace..
> 
> Here are the details of what is happening..




		



Digester trimming whitespaces

Posted by Marco Mistroni <mm...@waersystems.com>.
Hello all,
	I am currently having a problem (?) with digester, in the
sense that when parsing XML it is 'trimming' whitespace..

Here are the details of what is happening..


I have a client that is requesting XML from a webservice.
Once received, the client parses it using digester..

In the XML I have a <content> tag that contains some data.

Normally, (in 5 cases out of 6), xml is like this

...
<content>|HEADER|some csv data|FOOTER|</content>


but in 1 case out of 6, xml is like this..

<content> some csv data</content>

yes, there is a whitespace before the text.

When digester parses the XML, the result for the
<content> tag will be 'some csv data': the leading
whitespace is trimmed....


can anyone help?
I found a post by Robert burrell donkin about that (it was dated  Sun,
13 Jul 2003 15:31:41 -0700) 

I found no follow-ups.... can anyone update me, or tell me how to solve
my problem?

Thanx in advance and regards
	marco




