Posted to users@groovy.apache.org by "Winnebeck, Jason" <Ja...@windstream.com> on 2016/03/01 21:02:53 UTC

XmlSlurper, attributes and namespaces

I've been struggling for a long time with how to read attributes using XmlSlurper. Unfortunately, I have an XML document with namespaces. I wish I could ignore the namespaces, but while the XmlSlurper no-arg constructor says it ignores namespaces, it does not. When I use the two-arg constructor to set namespaceAware to false, it just acts as if the namespaced elements were deleted. So I'm trying to figure out how to specify namespaces:

import groovy.xml.*

def text = """<x:root xmlns:x="blah">
  <x:child x:id='1'>c</x:child>
</x:root>"""

def xml =
    new XmlSlurper() 
        .parseText(text)
        .declareNamespace(x:'blah')
//        .declareNamespace(t:'blah')

println xml.child.text()     //"c" always
println xml.'x:child'.text() //"c" when declareNamespace x
println xml.'t:child'.text() //"c" when declareNamespace t
println xml.child.'@x:id'    //"1" always
println xml.child.'@t:id'    //"" always

It appears that specifying the namespace is optional on elements, and that declareNamespace affects how I find elements when they do have namespaces. For attributes, declareNamespace appears to have no effect, and I need to specify the prefix exactly as it is written in the file itself. The problem is that whoever generates the file gets to choose any prefix they want. How can I get the "id" attribute on the "child" element regardless of the namespace prefix used? (A solution dropping all namespaces is fine, as there is only one namespace and no collisions.)
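
For comparison, here is a minimal sketch of a prefix-independent lookup that sidesteps XmlSlurper entirely. It is an assumption about what would be acceptable, not a fix for the behaviour described above. It uses the JDK's DOM parser in namespace-aware mode, where attributes are looked up by namespace URI and local name, so the prefix chosen by the generator never appears in the lookup. The "gen" prefix and the variable names are made up for illustration.

```groovy
import javax.xml.parsers.DocumentBuilderFactory

// Same shape as the document above, but with a prefix the reader did not choose.
def text = '''<gen:root xmlns:gen="blah">
  <gen:child gen:id='1'>c</gen:child>
</gen:root>'''

def factory = DocumentBuilderFactory.newInstance()
factory.setNamespaceAware(true)   // needed for the *NS lookups below

def doc = factory.newDocumentBuilder()
        .parse(new ByteArrayInputStream(text.getBytes('UTF-8')))

// Look up by (namespace URI, local name); the prefix is never mentioned.
def child = doc.getElementsByTagNameNS('blah', 'child').item(0)
def id = child.getAttributeNS('blah', 'id')

println id                       // prints 1
println child.getTextContent()   // prints c
```

The trade-off is losing GPath navigation; the lookup itself depends only on the namespace URI ('blah'), which the document format fixes, not on the prefix.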

Thanks,
Jason Winnebeck

----------------------------------------------------------------------
This email message and any attachments are for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message and any attachments.

RE: XmlSlurper, attributes and namespaces

Posted by "Winnebeck, Jason" <Ja...@windstream.com>.
Ok I created https://issues.apache.org/jira/browse/GROOVY-7781

One workaround I wanted to try but could never find: I see no way in GPathResult to iterate over or list the attributes. Since my document is entirely within a single namespace, I was hoping that I could search for the attributes by their name, excluding the namespace. But when I use the find method in GPathResult, it appears to return only elements and text content, not attributes.
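
As a rough illustration of the search-by-local-name idea, the sketch below again uses the JDK DOM API rather than GPathResult, so it is a substitute and not the XmlSlurper workaround itself. A namespace-aware DOM element exposes its attributes as a NamedNodeMap, which can be walked and filtered on local name alone. The helper name attributeByLocalName is hypothetical.

```groovy
import javax.xml.parsers.DocumentBuilderFactory
import org.w3c.dom.Element

// Hypothetical helper: value of the first attribute whose local name matches,
// whatever its namespace or prefix.
String attributeByLocalName(Element element, String localName) {
    def attrs = element.getAttributes()
    for (int i = 0; i < attrs.getLength(); i++) {
        def attr = attrs.item(i)
        if (attr.getLocalName() == localName) {
            return attr.getNodeValue()
        }
    }
    return null   // no attribute with that local name
}

def text = '''<x:root xmlns:x="blah">
  <x:child x:id='1'>c</x:child>
</x:root>'''

def factory = DocumentBuilderFactory.newInstance()
factory.setNamespaceAware(true)   // namespace-aware parsing populates getLocalName()
def root = factory.newDocumentBuilder()
        .parse(new ByteArrayInputStream(text.getBytes('UTF-8')))
        .getDocumentElement()

// '*' as the namespace URI matches elements in any namespace.
def child = root.getElementsByTagNameNS('*', 'child').item(0) as Element
println attributeByLocalName(child, 'id')      // prints 1
println attributeByLocalName(child, 'other')   // prints null
```

Matching on local name alone is only safe under the assumption stated above, namely a single namespace with no colliding attribute names.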

Jason

-----Original Message-----
From: Paul King [mailto:paulk@asert.com.au] 
Sent: Sunday, March 06, 2016 6:20 AM
To: users@groovy.apache.org
Subject: Re: XmlSlurper, attributes and namespaces

Yes, I think it is a bug. I thought we had a workaround using star, i.e. node.@'*:attributeName', much like node.'*:tagName' but the attribute version with star doesn't work either.


Re: XmlSlurper, attributes and namespaces

Posted by Paul King <pa...@asert.com.au>.
Yes, I think it is a bug. I thought we had a workaround using star,
i.e. node.@'*:attributeName', much like node.'*:tagName' but the
attribute version with star doesn't work either.


On Sat, Mar 5, 2016 at 7:43 PM, Pascal Schumacher
<pa...@gmx.net> wrote:
> Hi Jason,
>
> I do not know. It would be nice if you would create a jira issue for this.
>
> Thanks,
> Pascal

Re: XmlSlurper, attributes and namespaces

Posted by Pascal Schumacher <pa...@gmx.net>.
Hi Jason,

I do not know. It would be nice if you would create a jira issue for this.

Thanks,
Pascal

Am 01.03.2016 um 21:21 schrieb Winnebeck, Jason:
> I can at least give the technical reason why this doesn't work -- there is namespaceMap and namespaceTagHints in GPathResult. namespaceMap is updated by declareNamespace but namespaceTagHints is not. I don't see a way to update namespaceTagHints and namespaceMap doesn't really even seem to be used. This seems like a bug in GPathResult?
>
> Jason

RE: XmlSlurper, attributes and namespaces

Posted by "Winnebeck, Jason" <Ja...@windstream.com>.
I can at least give the technical reason why this doesn't work -- there is namespaceMap and namespaceTagHints in GPathResult. namespaceMap is updated by declareNamespace but namespaceTagHints is not. I don't see a way to update namespaceTagHints and namespaceMap doesn't really even seem to be used. This seems like a bug in GPathResult?

Jason
