Posted to user@commons.apache.org by Luke Shannon <ls...@futurebrand.com> on 2005/02/07 20:30:06 UTC

Digester Quesion

Hello All;

I have been successfully processing a series of documents with the following
code:

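// thanks to Wendy
// (digester and parseMe are presumably fields of the enclosing class)
public TreeMap digest() throws IOException, SAXException {
  digester = new Digester();
  digester.setValidating(false);
  // create a TreeMap when <VERSION> starts; it becomes the parse result
  digester.addObjectCreate("DATA/VERSION", TreeMap.class);
  // for each <ITEM>, call put(NAME attribute, element body) on the TreeMap
  digester.addCallMethod("DATA/VERSION/ITEM", "put", 2);
  digester.addCallParam("DATA/VERSION/ITEM", 0, "NAME");
  digester.addCallParam("DATA/VERSION/ITEM", 1);
  return (TreeMap) digester.parse(parseMe);
}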

What I get back is a TreeMap with the value of each ITEM's NAME attribute as
the Map key and the body of <ITEM> as the Map value. Things were running
smoothly until I was told today that some <ITEM> elements contain a SIZE
attribute. Example:

 <ITEM DIR="air3.pdf" HEIGHT="-1" NAME="kcfileupload" SIZE="117960"
STYPE="file" TYPE="upload" WIDTH="1">air3.pdf</ITEM>

In this example I would need the value 117960 as well.

This is where my lack of Digester knowledge becomes apparent.

My first reaction was to create a pattern for the ITEM SIZE attribute (I have
an idea of how to do this but am not entirely sure) and then append it to the
value that gets used in this line:

digester.addCallParam("DATA/VERSION/ITEM", 1);

Note: I would include a delimiter I could use later to separate the
values. Again, I have an idea of how I would do this but am not entirely sure.

What I would like to know is: syntactically, is this possible? Is there an
easier way within the Digester API to do this?
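
For readers following the delimiter idea, here is a minimal sketch in plain
Java, independent of Digester. The class name DelimitedValue, the "|"
delimiter, and both method names are assumptions made for illustration; none
of them come from the code in this message. It only shows how a combined
value could be built and later taken apart, not how Digester would produce it.

```java
import java.util.regex.Pattern;

// Hypothetical helper: packs an item's body text and optional SIZE into one String.
class DelimitedValue {
    // Assumed delimiter; it must never occur inside the body text itself.
    static final String DELIMITER = "|";

    // Returns "body|size", or just "body" when the item has no SIZE attribute.
    static String join(String body, String size) {
        return size == null ? body : body + DELIMITER + size;
    }

    // Splits a stored value back into [body] or [body, size].
    static String[] split(String stored) {
        // Pattern.quote is needed because "|" is a regex metacharacter.
        return stored.split(Pattern.quote(DELIMITER), 2);
    }

    public static void main(String[] args) {
        String stored = join("air3.pdf", "117960");
        String[] parts = split(stored);
        System.out.println(parts[0] + " has size " + parts[1]);
    }
}
```

The weakness of this approach is that a body containing the delimiter is
split in the wrong place, which is one reason a small bean with separate
fields tends to be easier to live with.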

Thanks,

Luke



---------------------------------------------------------------------
To unsubscribe, e-mail: commons-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: commons-user-help@jakarta.apache.org


Re: Digester Quesion

Posted by Luke Shannon <ls...@futurebrand.com>.
This is a very good idea. Once I get this going I should be in a good
position to implement a new digest method I was playing with last night.

I wish there were more reference books on this. The only one I could find
was Jakarta Commons Cookbook by Timothy M. O'Brien. If anyone knows of
others, please let me know.

Thanks for your help.

Luke



Re: Digester Quesion

Posted by Luke Shannon <ls...@futurebrand.com>.
I think I may have this figured out. I can now parse data successfully with
some test documents.

Here is my Data class and my digest method.

public class Data {
    private String type;
    //the document contains only one size attribute if there is one
    private String size;
    private TreeMap fields;

    public Data() {
        type = null;
        size = null;
        fields = new TreeMap();

    }

    /**
     * @return Returns the fields.
     */
    public TreeMap getFields() {
        return fields;
    }
    /**
     * Adds one key/value pair to the fields map.
     * @param key The field name.
     * @param value The field value.
     */
    public void setFields(Object key, Object value) {
        fields.put(key, value);
    }
    /**
     * @return Returns the size. If null is returned, the document didn't
     *         have a size attribute.
     */
    public String getSize() {
        return size;
    }
    /**
     * @param size The size to set.
     */
    public void setSize(String size) {
        this.size = size;
    }
    /**
     * @return Returns the type.
     */
    public String getType() {
        return type;
    }
    /**
     * @param type The type to set.
     */
    public void setType(String type) {
        this.type = type;
    }

    public String toString() {
        StringBuffer returnMe = new StringBuffer();
        returnMe.append("This data instance contains : \n");
        returnMe.append("Type : " + type + "\n");
        returnMe.append("Size (null indicates no value in document) : "
                + size + "\n");
        returnMe.append("The fields are : \n");
        Iterator keys = fields.keySet().iterator();
        while (keys.hasNext()) {
            String field = (String) keys.next();
            String value = (String) fields.get(field);
            returnMe.append("Key: " + field + "\n");
            returnMe.append("Values: " + value + "\n");
        }
        return returnMe.toString();
    }
}

public Data digestData() throws IOException, SAXException {
  digester = new Digester();
  digester.setValidating(false);
  digester.addObjectCreate("DATA", Data.class);
  // get the type from the body of <TYPE>
  digester.addCallMethod("DATA/DEF/TYPE", "setType", 0);
  // get the SIZE attribute
  digester.addCallMethod("DATA/VERSION/ITEM", "setSize", 1);
  digester.addCallParam("DATA/VERSION/ITEM", 0, "SIZE");
  // add each item's NAME attribute and body text to Data's TreeMap
  digester.addCallMethod("DATA/VERSION/ITEM", "setFields", 2);
  digester.addCallParam("DATA/VERSION/ITEM", 0, "NAME");
  digester.addCallParam("DATA/VERSION/ITEM", 1);
  return (Data) digester.parse(parseMe);
}



Re: Digester Quesion

Posted by Simon Kitching <sk...@apache.org>.
On Tue, 2005-02-08 at 18:56 -0500, Luke Shannon wrote:
> Thanks for your response Simon.
> 
> The program is already mostly working with dummy data. In my previous post I
> mentioned I needed three things. All three come from the xml, one I am
> already getting (using digester). The other 2 I have been hard coding.
> Requirements have changed and now I need the actual data.

In that case, I recommend updating your program to handle the new items
but initialise that data the same way you currently initialise your
dummy data (i.e. not using digester). Once you have figured out how you
will store your data (which I expect will involve new members on an Item
class) and got the rest of the app working with that, then I am sure we
can help you to get Digester to populate those items from the xml.

Good luck,

Simon





Re: [digester] Re: Digester Quesion

Posted by Luke Shannon <ls...@futurebrand.com>.
Thanks Simon.



Re: [digester] Re: Digester Quesion

Posted by Simon Kitching <sk...@apache.org>.
On Wed, 2005-02-09 at 18:26 -0500, Luke Shannon wrote:
> Hi All;
> 
> Someone just looked at my code that uses Digester and commented that they
> thought it would be slow (compared to SAX or DOM directly) because it would
> need to use "reflection a lot".
> 
> I am not sure what this means. But since I am using Digester a lot I would
> like to know if it does use a lot of resources (compared to other XML
> parsing APIs) and if there is anything I can do about it. If someone could
> point me to some reference articles that would be great too, I may need to
> present some evidence to support my choice of Digester.

Yes, Digester does use reflection to invoke methods on your classes, and
that does carry a small performance penalty. Hand-coded SAX-based code
will be faster. Anything using DOM will be slower, as DOM has its own
penalties that far outweigh those of Digester.
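
Since the question asked what "reflection" means, here is a minimal sketch
of the general idea using only the JDK. It is not Digester's actual code;
ReflectionDemo and callByName are hypothetical names chosen for this
example. It looks a method up by its name at run time and invokes it, which
is roughly what a rule that is given a method name as a String has to do.

```java
import java.lang.reflect.Method;
import java.util.TreeMap;

// Hypothetical demo class; not part of Digester.
class ReflectionDemo {
    // Looks up a public method by name at run time and invokes it on target.
    static Object callByName(Object target, String methodName,
                             Class<?>[] paramTypes, Object[] args) {
        try {
            Method method = target.getClass().getMethod(methodName, paramTypes);
            return method.invoke(target, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("reflective call failed", e);
        }
    }

    public static void main(String[] args) {
        TreeMap<String, String> map = new TreeMap<>();
        // Same effect as map.put("name", "ppt"), but resolved from the name "put".
        callByName(map, "put",
                new Class<?>[] {Object.class, Object.class},
                new Object[] {"name", "ppt"});
        System.out.println(map.get("name"));
    }
}
```

The lookup and the indirect call are where the extra cost comes from,
compared with a direct call that the compiler resolves ahead of time.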

But if you really cared about performance, you wouldn't be writing your
application in Java anyway, right? Nor using XML, which is much slower
than binary-encoded data formats.

In most cases, the performance difference between Digester and
hand-coded SAX handling just isn't significant. OK, maybe if you're
repeatedly processing massive volumes of XML in some kind of
performance-critical manner (e.g. a newspaper print-preparation system)
you should look at some alternatives. But even then you need to take
into account the costs of development and maintenance (lower when using
a decent library rather than hand-coding) vs the costs incurred by the
slightly lower performance.
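
To give a feel for the hand-coded alternative being compared here, the
following is a rough sketch of a SAX handler that collects the same
NAME-to-body map plus the SIZE attribute from the sample XML posted in this
thread. The class name ItemSaxHandler and its members are assumptions for
illustration, and as a simplification it matches any ITEM element rather
than checking the full DATA/VERSION/ITEM path.

```java
import java.io.StringReader;
import java.util.TreeMap;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical hand-coded handler; does by hand what the Digester rules declare.
class ItemSaxHandler extends DefaultHandler {
    final TreeMap<String, String> fields = new TreeMap<>();
    String size;                       // last SIZE attribute seen, or null
    private String currentName;        // NAME of the ITEM being read, or null
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        if ("ITEM".equals(qName)) {
            currentName = atts.getValue("NAME");
            if (atts.getValue("SIZE") != null) {
                size = atts.getValue("SIZE");
            }
            text.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Body text may arrive in several chunks, so accumulate it.
        if (currentName != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("ITEM".equals(qName) && currentName != null) {
            fields.put(currentName, text.toString());
            currentName = null;
        }
    }

    // Parses an XML string and returns the populated handler.
    static ItemSaxHandler parse(String xml) {
        try {
            ItemSaxHandler handler = new ItemSaxHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }
}
```

Even for this small case the handler has to track state across three
callbacks, which is the development and maintenance cost being weighed
against the reflection overhead.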

You can find a list of alternatives to digester here:
  http://wiki.apache.org/jakarta-commons/Digester/WhyUseDigester

Regards,

Simon




Re: [digester] Re: Digester Quesion

Posted by Luke Shannon <ls...@futurebrand.com>.
Hi All;

Someone just looked at my code that uses Digester and commented that they
thought it would be slow (compared to SAX or DOM directly) because it would
need to use "reflection a lot".

I am not sure what this means. But since I am using Digester a lot I would
like to know if it does use a lot of resources (compared to other XML
parsing APIs) and if there is anything I can do about it. If someone could
point me to some reference articles that would be great too, I may need to
present some evidence to support my choice of Digester.

Thanks,

Luke



Re: [digester] Re: Digester Quesion

Posted by Luke Shannon <ls...@futurebrand.com>.
No problem Wendy. I told you I didn't need any of the data other than what
was in the item tags. For that application, I liked your solution a lot.

Part of the problem is this document is not organized very well (in my
opinion). The item tags are treated differently depending on the attributes
they contain, and not all item tags across the document contain the same
attributes.

The silver lining of this cloud is when I finally do get this working I
think I will finally have a good understanding of digester :-)

Luke





[digester] Re: Digester Quesion

Posted by Wendy Smoak <ja...@wendysmoak.com>.
From: "Luke Shannon" <ls...@futurebrand.com>

> My previous email didn't make sense because I don't fully understand how
> digester works or how to best use it.

Digester can get the data out of the XML, but right now you don't have
anywhere to put it.  And that's partly my fault-- in your first message you
had 'Version' and 'Item' classes, and I talked you out of using them.  (You
said you wanted a TreeMap, so I helped you populate a TreeMap!)  I think you
were on the right track originally, see what you get for listening to me? :)

-- 
Wendy Smoak




Re: Digester Quesion

Posted by Luke Shannon <ls...@futurebrand.com>.
Thanks for your response Simon.

The program is already mostly working with dummy data. In my previous post I
mentioned I needed three things. All three come from the xml, one I am
already getting (using digester). The other 2 I have been hard coding.
Requirements have changed and now I need the actual data.

My previous email didn't make sense because I don't fully understand how
digester works or how to best use it. All I know is that there is data I need
from the schema I posted and I don't want to have to use SAX or DOM to get
it. :-)

I think with some further reading and practice I should be able to figure
this out.

I hope I haven't frustrated anyone with confusing posts.

Sincerely,

Luke



Re: Digester Quesion

Posted by Simon Kitching <sk...@apache.org>.
On Tue, 2005-02-08 at 15:27 -0500, Luke Shannon wrote:
> Hi Wendy;
> 
> I need a little more advice. It seems I now need more than the item tag.
> 
> Here is a full xml sample:

Hi Luke,

I think you need to sort out your data representation first, before even
thinking about Digester. Your description below of how you want to store
this data in memory doesn't make a whole lot of sense; writing code that
*uses* this data will probably make this much clearer to you.

So I suggest writing the rest of your program first, using code to
initialise dummy data. Once you've got that working, then look again at
how you can use Digester to read XML into the classes you've developed.

Regards,

Simon





Re: Digester Quesion

Posted by Luke Shannon <ls...@futurebrand.com>.
Hi Wendy;

I need a little more advice. It seems I now need more than the item tag.

Here is a full xml sample:

<DATA>
  <DEF>
    <TYPE>138</TYPE>
    <TYPES />
  </DEF>
  <VERSION>
    <ITEM NAME="category" TYPE="text">Category 4</ITEM>
    <ITEM NAME="provider" TYPE="text">luke</ITEM>
    <ITEM NAME="progress_ref" TYPE="text">1100816505287</ITEM>
    <ITEM NAME="name" TYPE="text">ppt</ITEM>
    <ITEM NAME="desc" TYPE="text">dsfds</ITEM>
    <ITEM NAME="sort" TYPE="text">8</ITEM>
    <!-- an item like this one is optional -->
    <ITEM DIR="donutwars.ppt" HEIGHT="-1" NAME="kcfileupload" SIZE="9728"
          STYPE="file" TYPE="upload" WIDTH="-1">donutwars.ppt</ITEM>
  </VERSION>
</DATA>

From this I need:

1. As before a TreeMap containing @NAME (map key) and the value of the item
tag (map value).
2. If there is a SIZE attribute I need it.
3. I need the TYPE value now too.

What do you recommend? Create a DATA object which contains a TreeMap for the
ITEMs, a size variable for SIZE (if there is one) and a type variable for
TYPE?

Any tips you can give me would be great.

Luke





Re: Digester Quesion

Posted by Wendy Smoak <ja...@wendysmoak.com>.
From: "Luke Shannon" lshannon@futurebrand.com

> What I end up back is a TreeMap containing the value of the ITEM@name as the
> Map key and the value of <ITEM> as the Map value. Things were running
> smoothly until I was told today that some <ITEM> contain a size attribute.
> Example:
>
>  <ITEM DIR="air3.pdf" HEIGHT="-1" NAME="kcfileupload" SIZE="117960"
> STYPE="file" TYPE="upload" WIDTH="1">air3.pdf</ITEM>

Right now you have a Map where the key/value pair is String/String, correct?
And in the example above, you're currently _only_ storing the 'NAME'
attribute, and whatever is in the body of the <ITEM> tag?

Didn't I talk you out of having an 'Item' class last week?  Looks like you
need it again!  If an 'item' has more properties, it seems to me like you'll
need an Item 'bean', with get/set methods for the various properties.

If you create an Item when you see DATA/VERSION/ITEM, then you can use the
Item as the second parameter to 'put' on the TreeMap.  I have an example of
something similar-- it creates an ArrayList of LazyDynaBeans, you should be
able to adapt it to creating a TreeMap of Items.

http://wiki.wendysmoak.com/cgi-bin/wiki.pl?DigesterLazyDynaBean
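
A minimal sketch of the Item bean idea described above, in plain Java with
the map filled in by hand rather than by Digester. The property names
(name, value, size) and the ItemMapDemo class are assumptions for
illustration; they are not taken from the wiki example linked above or from
any code posted in this thread.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical bean holding what the thread needs from one <ITEM>.
class Item {
    private String name;   // NAME attribute
    private String value;  // element body text
    private String size;   // SIZE attribute, or null when absent

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public String getValue() { return value; }
    public void setValue(String value) { this.value = value; }
    public String getSize() { return size; }
    public void setSize(String size) { this.size = size; }
}

// Hypothetical demo: a TreeMap keyed by NAME whose values are Item beans.
class ItemMapDemo {
    static Item item(String name, String value, String size) {
        Item item = new Item();
        item.setName(name);
        item.setValue(value);
        item.setSize(size);
        return item;
    }

    // Dummy data mirroring two of the items quoted in this thread.
    static Map<String, Item> sampleItems() {
        Map<String, Item> items = new TreeMap<>();
        items.put("category", item("category", "Category 4", null));
        items.put("kcfileupload", item("kcfileupload", "air3.pdf", "117960"));
        return items;
    }

    public static void main(String[] args) {
        Item upload = sampleItems().get("kcfileupload");
        System.out.println(upload.getValue() + " has size " + upload.getSize());
    }
}
```

With the size stored in its own property there is no need for a delimiter
inside the map value, and further attributes can be added as new properties.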

-- 
Wendy Smoak

