Posted to dev@maven.apache.org by John Casey <jd...@commonjava.org> on 2011/08/03 00:00:25 UTC

POM reader that preserves CDATA?

Hi all,

I'm working on some tooling for $dayjob that needs to manipulate POM 
files according to certain rules.

The problem I'm running into is that some of the POMs it must manipulate 
contain CDATA sections, comments, etc. Also, since the modified POM 
often will be used as the basis for a patch file, I'd like to preserve 
as much of the ordering and existing whitespace in the file as possible, 
to minimize the patchfile size.

I'm currently using the JDom-driven, Modello-generated writer, coupled 
with the stock XPP3-driven reader (not the best, I know). It's losing 
the CDATA (big problem) and injecting ^M (wrong line ending, little 
problem)...
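
For the ^M side of this, one low-tech option is to post-process the writer's output so that it uses whatever line ending the original POM used. A minimal sketch, assuming the whole file fits in a String; the class and method names (LineEndings, detect, apply, matchOriginal) are made up for illustration and are not part of Modello, JDOM, or Maven:

```java
final class LineEndings {
    private LineEndings() {}

    /** Returns "\r\n" if CRLF endings outnumber bare LF endings, otherwise "\n". */
    static String detect(String text) {
        int crlf = 0;
        int lf = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == '\n') {
                if (i > 0 && text.charAt(i - 1) == '\r') {
                    crlf++;
                } else {
                    lf++;
                }
            }
        }
        return crlf > lf ? "\r\n" : "\n";
    }

    /** Rewrites every line ending in text (CRLF, lone CR, or LF) to eol. */
    static String apply(String text, String eol) {
        // Collapse everything to LF first so CRLF is never doubled up.
        String normalized = text.replace("\r\n", "\n").replace('\r', '\n');
        return eol.equals("\n") ? normalized : normalized.replace("\n", eol);
    }

    /** Makes the rewritten POM use the same line ending as the original. */
    static String matchOriginal(String original, String rewritten) {
        return apply(rewritten, detect(original));
    }
}
```

Calling LineEndings.matchOriginal(originalPomText, writerOutput) before saving should keep a patch from touching every line just because the line endings changed. It does nothing for CDATA, ordering, or indentation.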

Does anyone have experience with this? Anyone maybe have an advanced POM 
reader/writer stashed somewhere that can preserve CDATA and the like?

Thanks,

-john

-- 
John Casey
Developer, PMC Chair - Apache Maven (http://maven.apache.org)
Blog: http://www.johnofalltrades.name/

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@maven.apache.org
For additional commands, e-mail: dev-help@maven.apache.org


Re: POM reader that preserves CDATA?

Posted by Daniel Kulp <dk...@apache.org>.
You could TRY feeding the POM schema into JAXB to generate JAXB objects from 
it. From there, using a JAXBContext, you can call context.createBinder() and 
then use that to unmarshal the XML DOM into JAXB objects. You can then 
manipulate the JAXB objects and have the Binder update the XML afterward. 
Supposedly, the Binder allows semi-preserving the infoset while you work 
with the JAXB objects.

That said, I've never tried it, particularly with CDATA.  :-)

Dan



-- 
Daniel Kulp
dkulp@apache.org
http://dankulp.com/blog
Talend - http://www.talend.com



Re: POM reader that preserves CDATA?

Posted by Mark Struberg <st...@yahoo.de>.
Don't we need this for the maven-release-manager anyway once we start adding arbitrary stuff?

LieGrue,
strub

--- On Tue, 8/2/11, John Casey <jd...@commonjava.org> wrote:

> From: John Casey <jd...@commonjava.org>
> Subject: Re: POM reader that preserves CDATA?
> To: "Maven Developers List" <de...@maven.apache.org>
> Date: Tuesday, August 2, 2011, 10:22 PM
> 
> 
> On 8/2/11 6:10 PM, Benson Margulies wrote:
> > In general, you really can't expect to retain CDATA unless you build a
> > DOM tree, and maybe not then. It's a fundamental principle of XML that
> > CDATA isn't part of 'the infoset' -- the data that is represented by
> > the file. A parser is under no obligation to faithfully report this
> > stuff, so long as the right text ends up in the right place.
> >
> > However, forget xpp3 and all that ancient lumber. Use Woodstox, or the
> > Xerces DOM, and you can probably find enough options to ask it to do
> > what you need.
> 
> yeah, that's basically what I was afraid I'd be left with...in other
> words, no quick and easy solution, but basically a reinvention of the
> model reader with a lot of extras.
> 


Re: POM reader that preserves CDATA?

Posted by John Casey <jd...@commonjava.org>.

On 8/2/11 6:10 PM, Benson Margulies wrote:
> In general, you really can't expect to retain CDATA unless you build a
> DOM tree, and maybe not then. It's a fundamental principle of XML that
> CDATA isn't part of 'the infoset' -- the data that is represented by
> the file. A parser is under no obligation to faithfully report this
> stuff, so long as the right text ends up in the right place.
>
> However, forget xpp3 and all that ancient lumber. Use Woodstox, or the
> Xerces DOM, and you can probably find enough options to ask it to do
> what you need.

yeah, that's basically what I was afraid I'd be left with...in other 
words, no quick and easy solution, but basically a reinvention of the 
model reader with a lot of extras.


-- 
John Casey
Developer, PMC Chair - Apache Maven (http://maven.apache.org)
Blog: http://www.johnofalltrades.name/



Re: POM reader that preserves CDATA?

Posted by Benson Margulies <bi...@gmail.com>.
In general, you really can't expect to retain CDATA unless you build a
DOM tree, and maybe not then. It's a fundamental principle of XML that
CDATA isn't part of 'the infoset' -- the data that is represented by
the file. A parser is under no obligation to faithfully report this
stuff, so long as the right text ends up in the right place.

However, forget xpp3 and all that ancient lumber. Use Woodstox, or the
Xerces DOM, and you can probably find enough options to ask it to do
what you need.
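
To make the DOM suggestion concrete, here is a minimal sketch using only the JAXP parser and identity Transformer bundled with the JDK (not Woodstox or a separately installed Xerces). With coalescing left off, CDATA sections come back as their own nodes and are written out again as CDATA. The class and method names (CdataDomRoundTrip, parse, countCdata, setFirstElementText, serialize) are invented for illustration. This keeps CDATA and comments, but it is not guaranteed to preserve attribute quoting, the XML declaration, or other lexical details.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class CdataDomRoundTrip {
    private CdataDomRoundTrip() {}

    /** Parses XML text into a DOM tree, keeping CDATA sections as separate nodes. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // false is the default; stated explicitly because true would merge
            // CDATA sections into ordinary text nodes.
            factory.setCoalescing(false);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    /** Counts CDATA section nodes at or below the given node. */
    static int countCdata(Node node) {
        int count = node.getNodeType() == Node.CDATA_SECTION_NODE ? 1 : 0;
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            count += countCdata(children.item(i));
        }
        return count;
    }

    /** Replaces the text of the first element with the given tag name, if any. */
    static void setFirstElementText(Document doc, String tag, String value) {
        NodeList nodes = doc.getElementsByTagName(tag);
        if (nodes.getLength() > 0) {
            nodes.item(0).setTextContent(value);
        }
    }

    /** Writes the DOM tree back to text with an identity transform. */
    static String serialize(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not serialize XML", e);
        }
    }
}
```

Typical use would be parse, then setFirstElementText(doc, "version", ...), then serialize. Note that "first element with that name" is a simplification: in a real POM the first version element may belong to the parent section.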




Re: POM reader that preserves CDATA?

Posted by Stephen Connolly <st...@gmail.com>.
versions-maven-plugin tackles this head-on.

One of these days I'll get some time to finish xevpp.codehaus.org
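
For readers wondering what editing a POM without re-serializing it can look like, here is a deliberately naive sketch of the general idea: find the span of text to change and splice the new value into the original string, so every other character (CDATA, comments, whitespace, line endings) is left untouched. This is not a description of how versions-maven-plugin is implemented. The class and method names (InPlacePomEdit, replaceFirstElementText) are hypothetical, and the regex is an assumption that only works for a simple element with no attributes and no nested markup; it would also match a tag that happens to sit inside a comment or CDATA section.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class InPlacePomEdit {
    private InPlacePomEdit() {}

    /**
     * Replaces the text of the first <tag>...</tag> found in the raw POM text.
     * Returns the input unchanged when no such element is found.
     */
    static String replaceFirstElementText(String pom, String tag, String newValue) {
        String quoted = Pattern.quote(tag);
        // Group 1 is the element text; [^<]* stops at the first markup character.
        Pattern pattern = Pattern.compile("<" + quoted + ">([^<]*)</" + quoted + ">");
        Matcher matcher = pattern.matcher(pom);
        if (!matcher.find()) {
            return pom;
        }
        // Splice only the value; everything before and after is copied verbatim.
        return pom.substring(0, matcher.start(1)) + escape(newValue) + pom.substring(matcher.end(1));
    }

    /** Escapes the characters that are not allowed verbatim in element text. */
    private static String escape(String value) {
        return value.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

Because only the replaced value differs from the input, a diff against the original file contains just the lines that were actually changed. A robust version would need a real parser that reports character offsets instead of a regex.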

- Stephen
