Posted to dev@cocoon.apache.org by Stefano Mazzocchi <st...@apache.org> on 2004/02/09 03:18:16 UTC

Cleaning up unused namespace declaration

I'm not normally bugged by namespace declarations which aren't used, 
but boy, something like this just can't go on without me doing 
something about it:

<br xmlns:dir="http://apache.org/cocoon/directory/2.0" 
xmlns:include="http://apache.org/cocoon/include/1.0"/>

[taken from my blog output]

do you have any suggestions on how to use the xml serializer so that it 
stops doing that?

--
Stefano.


Re: Cleaning up unused namespace declaration

Posted by Upayavira <uv...@upaya.co.uk>.
Geoff Howard wrote:

> Vadim Gritsenko wrote:
>
>> Sylvain Wallez wrote:
>>
>>> Geoff Howard wrote:
>>
>>
>> ..
>>
>>>> Couldn't you also configure it to remove all ns without checking if 
>>>> you know your output should have no other namespaces (as usually 
>>>> the case with xhtml)?  That way you could avoid buffering in that 
>>>> special but common case.
>>>
>>>
>>> We can filter out all namespaces in HTML, but not in XHTML, as we 
>>> can have composite documents with foreign markup.
>>
>>
>> Geoff refers to *common case*, which does not have any foreign 
>> markup, and you know this in advance. In this common case, it makes 
>> perfect sense to save CPU cycles and dump all but xhtml namespaces, 
>> and have configuration parameter like
>>    leave-one-ns-namespace-but-drop-all-others="http://.../xhtml"
>
Having done that, why not just allow an arbitrary number of namespaces 
to be specified? That way, assuming you know your namespaces, you can 
have them passed through. This I would say is likely in the majority of 
situations. Namespaces not recognised will be stripped.
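
A minimal sketch of that idea as a plain SAX filter, assuming the pass-through list arrives as a set of namespace URIs. The class name NamespaceWhitelistFilter and the helper survivingNamespaces are made up for illustration and are not part of Cocoon. The filter streams (no buffering), and it only drops declarations: it assumes no element or attribute in the document actually uses a stripped namespace.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical filter: forwards only whitelisted namespace declarations. */
class NamespaceWhitelistFilter extends XMLFilterImpl {
    private final Set<String> allowedUris;
    // Per prefix, remembers whether each open mapping was forwarded,
    // because endPrefixMapping() only receives the prefix.
    private final Map<String, Deque<Boolean>> forwarded = new HashMap<>();

    NamespaceWhitelistFilter(XMLReader parent, Set<String> allowedUris) {
        super(parent);
        this.allowedUris = allowedUris;
    }

    @Override
    public void startPrefixMapping(String prefix, String uri) throws SAXException {
        boolean keep = allowedUris.contains(uri);
        forwarded.computeIfAbsent(prefix, p -> new ArrayDeque<>()).push(keep);
        if (keep) {
            super.startPrefixMapping(prefix, uri);
        }
    }

    @Override
    public void endPrefixMapping(String prefix) throws SAXException {
        Deque<Boolean> stack = forwarded.get(prefix);
        if (stack != null && !stack.isEmpty() && stack.pop()) {
            super.endPrefixMapping(prefix);
        }
    }

    /** Parses xml through the filter and returns the namespace URIs that survive. */
    static List<String> survivingNamespaces(String xml, Set<String> allowedUris)
            throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader parser = factory.newSAXParser().getXMLReader();
        List<String> seen = new ArrayList<>();
        NamespaceWhitelistFilter filter = new NamespaceWhitelistFilter(parser, allowedUris);
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void startPrefixMapping(String prefix, String uri) {
                seen.add(uri);
            }
        });
        filter.parse(new InputSource(new StringReader(xml)));
        return seen;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
            + "<br xmlns:dir=\"http://apache.org/cocoon/directory/2.0\" "
            + "xmlns:include=\"http://apache.org/cocoon/include/1.0\"/></html>";
        System.out.println(
            survivingNamespaces(xml, Set.of("http://www.w3.org/1999/xhtml")));
    }
}
```

With a single URI in the set this behaves like the one-namespace parameter proposed above; an empty set would drop every declaration, which matches the plain HTML case.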

Upayavira



Re: Cleaning up unused namespace declaration

Posted by Geoff Howard <co...@leverageweb.com>.
Vadim Gritsenko wrote:

> Sylvain Wallez wrote:
> 
>> Geoff Howard wrote:
> 
> ..
> 
>>> Couldn't you also configure it to remove all ns without checking if 
>>> you know your output should have no other namespaces (as usually the 
>>> case with xhtml)?  That way you could avoid buffering in that special 
>>> but common case.
>>
>> We can filter out all namespaces in HTML, but not in XHTML, as we can 
>> have composite documents with foreign markup.
> 
> Geoff refers to *common case*, which does not have any foreign markup, 
> and you know this in advance. In this common case, it makes perfect 
> sense to save CPU cycles and dump all but xhtml namespaces, and have 
> configuration parameter like
>    leave-one-ns-namespace-but-drop-all-others="http://.../xhtml"
> 
> For non-common scenarios, with different embedded namespaces etc, more 
> complicated handling should be used, total agreement here.
> 
> For common-case HTML serialization, all namespaces could be dropped, 
> including xhtml.

Exactly what I meant but didn't say clearly.

Geoff


Re: Cleaning up unused namespace declaration

Posted by Vadim Gritsenko <va...@reverycodes.com>.
Sylvain Wallez wrote:

> Geoff Howard wrote:

..

>> Couldn't you also configure it to remove all ns without checking if 
>> you know your output should have no other namespaces (as usually the 
>> case with xhtml)?  That way you could avoid buffering in that special 
>> but common case.
>
>
> We can filter out all namespaces in HTML, but not in XHTML, as we can 
> have composite documents with foreign markup.


Geoff refers to *common case*, which does not have any foreign markup, 
and you know this in advance. In this common case, it makes perfect 
sense to save CPU cycles and dump all but xhtml namespaces, and have a 
configuration parameter like
    leave-one-ns-namespace-but-drop-all-others="http://.../xhtml"

For non-common scenarios, with different embedded namespaces etc, more 
complicated handling should be used, total agreement here.

For common-case HTML serialization, all namespaces could be dropped, 
including xhtml.

Vadim


Re: Cleaning up unused namespace declaration

Posted by Sylvain Wallez <sy...@apache.org>.
Geoff Howard wrote:

> Stefano Mazzocchi wrote:
>
>> On 9 Feb 2004, at 04:40, Sylvain Wallez wrote:
>
>>> A long time ago, I wrote a NamespaceNormalizer that works in two steps:
>>> 1 - record all SAX events (using XMLByteStreamCompiler) except 
>>> namespace declarations which are stored separately,
>>> 2 - replay SAX events, declaring all namespaces before the root 
>>> element.
>>>
>>> This leads to buffering the document, which has a cost, but wouldn't 
>>> hurt when the pipeline can be cached.
>>>
>>> We could add this as an optional feature of the XMLSerializer.
>>>
>>> WDYT?
>>
>>
>> I think it would *rock*! I wouldn't mind paying the cost of such a 
>> buffering since I'm going to cache this anyway.
>
>
> Couldn't you also configure it to remove all ns without checking if 
> you know your output should have no other namespaces (as usually the 
> case with xhtml)?  That way you could avoid buffering in that special 
> but common case.


We can filter out all namespaces in HTML, but not in XHTML, as we can 
have composite documents with foreign markup.

Sylvain

-- 
Sylvain Wallez                                  Anyware Technologies
http://www.apache.org/~sylvain           http://www.anyware-tech.com
{ XML, Java, Cocoon, OpenSource }*{ Training, Consulting, Projects }
Orixo, the opensource XML business alliance  -  http://www.orixo.com



Re: Cleaning up unused namespace declaration

Posted by Geoff Howard <co...@leverageweb.com>.
Stefano Mazzocchi wrote:

> 
> On 9 Feb 2004, at 04:40, Sylvain Wallez wrote:
> 
>> Stefano Mazzocchi wrote:
>>
>>> I'm not normally bugged by namespace declarations which aren't used, 
>>> but boy, something like this just can't go on without me doing 
>>> something about it:
>>>
>>> <br xmlns:dir="http://apache.org/cocoon/directory/2.0" 
>>> xmlns:include="http://apache.org/cocoon/include/1.0"/>
>>>
>>> [taken from my blog output]
>>>
>>> do you have any suggestions on how to use the xml serializer so that 
>>> it stops doing that?
>>
>>
>>
>> A long time ago, I wrote a NamespaceNormalizer that works in two steps:
>> 1 - record all SAX events (using XMLByteStreamCompiler) except 
>> namespace declarations which are stored separately,
>> 2 - replay SAX events, declaring all namespaces before the root element.
>>
>> This leads to buffering the document, which has a cost, but wouldn't 
>> hurt when the pipeline can be cached.
>>
>> We could add this as an optional feature of the XMLSerializer.
>>
>> WDYT?
> 
> 
> I think it would *rock*! I wouldn't mind paying the cost of such a 
> buffering since I'm going to cache this anyway.

Couldn't you also configure it to remove all namespaces without 
checking, if you know your output should have no other namespaces (as 
is usually the case with xhtml)?  That way you could avoid buffering in 
that special but common case.

Geoff


Re: Cleaning up unused namespace declaration

Posted by Sylvain Wallez <sy...@apache.org>.
Stefano Mazzocchi wrote:

>
> On 9 Feb 2004, at 04:40, Sylvain Wallez wrote:
>
>> Stefano Mazzocchi wrote:
>>
>>> I'm not normally bugged by namespace declarations which aren't used, 
>>> but boy, something like this just can't go on without me doing 
>>> something about it:
>>>
>>> <br xmlns:dir="http://apache.org/cocoon/directory/2.0" 
>>> xmlns:include="http://apache.org/cocoon/include/1.0"/>
>>>
>>> [taken from my blog output]
>>>
>>> do you have any suggestions on how to use the xml serializer so that 
>>> it stops doing that?
>>
>>
>> A long time ago, I wrote a NamespaceNormalizer that works in two steps:
>> 1 - record all SAX events (using XMLByteStreamCompiler) except 
>> namespace declarations which are stored separately,
>> 2 - replay SAX events, declaring all namespaces before the root element.
>>
>> This leads to buffering the document, which has a cost, but wouldn't 
>> hurt when the pipeline can be cached.
>>
>> We could add this as an optional feature of the XMLSerializer.
>>
>> WDYT?
>
>
> I think it would *rock*! I wouldn't mind paying the cost of such a 
> buffering since I'm going to cache this anyway.


I'll dig into my HD (which I'll have to do anyway when switching to the 
PB, hehe) to find it and add it after the release.

Sylvain




Re: Cleaning up unused namespace declaration

Posted by Stefano Mazzocchi <st...@apache.org>.
On 9 Feb 2004, at 04:40, Sylvain Wallez wrote:

> Stefano Mazzocchi wrote:
>
>> I'm not normally bugged by namespace declarations which aren't used, 
>> but boy, something like this just can't go on without me doing 
>> something about it:
>>
>> <br xmlns:dir="http://apache.org/cocoon/directory/2.0" 
>> xmlns:include="http://apache.org/cocoon/include/1.0"/>
>>
>> [taken from my blog output]
>>
>> do you have any suggestions on how to use the xml serializer so that 
>> it stops doing that?
>
>
> A long time ago, I wrote a NamespaceNormalizer that works in two steps:
> 1 - record all SAX events (using XMLByteStreamCompiler) except 
> namespace declarations which are stored separately,
> 2 - replay SAX events, declaring all namespaces before the root 
> element.
>
> This leads to buffering the document, which has a cost, but wouldn't 
> hurt when the pipeline can be cached.
>
> We could add this as an optional feature of the XMLSerializer.
>
> WDYT?

I think it would *rock*! I wouldn't mind paying the cost of such a 
buffering since I'm going to cache this anyway.

--
Stefano.


Re: Cleaning up unused namespace declaration

Posted by Sylvain Wallez <sy...@apache.org>.
Stefano Mazzocchi wrote:

> I'm not normally bugged by namespace declarations which aren't used, 
> but boy, something like this just can't go on without me doing 
> something about it:
>
> <br xmlns:dir="http://apache.org/cocoon/directory/2.0" 
> xmlns:include="http://apache.org/cocoon/include/1.0"/>
>
> [taken from my blog output]
>
> do you have any suggestions on how to use the xml serializer so that 
> it stops doing that?


A long time ago, I wrote a NamespaceNormalizer that works in two steps:
1 - record all SAX events (using XMLByteStreamCompiler) except namespace 
declarations which are stored separately,
2 - replay SAX events, declaring all namespaces before the root element.

This leads to buffering the document, which has a cost, but wouldn't 
hurt when the pipeline can be cached.
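
The two steps above can be sketched roughly as follows. This is not the original NamespaceNormalizer: it swaps XMLByteStreamCompiler for a plain in-memory list of recorded events, the class name BufferingNamespaceNormalizer and the helper trace are hypothetical, and only elements and text are recorded (comments, processing instructions and ignorable whitespace are dropped). It also assumes each prefix is bound to a single URI throughout the document; the first binding wins.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical two-step normalizer: record events, then replay them. */
class BufferingNamespaceNormalizer extends DefaultHandler {
    /** One recorded SAX event that can be replayed later. */
    interface SaxEvent {
        void replay(ContentHandler target) throws SAXException;
    }

    private final ContentHandler target;
    private final List<SaxEvent> events = new ArrayList<>();
    // Namespace declarations are stored separately, in first-seen order.
    private final Map<String, String> namespaces = new LinkedHashMap<>();

    BufferingNamespaceNormalizer(ContentHandler target) {
        this.target = target;
    }

    // Step 1: record everything except namespace declarations.
    @Override
    public void startPrefixMapping(String prefix, String uri) {
        namespaces.putIfAbsent(prefix, uri);
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        Attributes copy = new AttributesImpl(atts); // the parser may reuse atts
        events.add(h -> h.startElement(uri, local, qName, copy));
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        events.add(h -> h.endElement(uri, local, qName));
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        char[] copy = Arrays.copyOfRange(ch, start, start + length);
        events.add(h -> h.characters(copy, 0, copy.length));
    }

    // Step 2: replay, declaring all namespaces before the root element.
    @Override
    public void endDocument() throws SAXException {
        target.startDocument();
        for (Map.Entry<String, String> ns : namespaces.entrySet()) {
            target.startPrefixMapping(ns.getKey(), ns.getValue());
        }
        for (SaxEvent event : events) {
            event.replay(target);
        }
        for (String prefix : namespaces.keySet()) {
            target.endPrefixMapping(prefix);
        }
        target.endDocument();
    }

    /** Runs xml through the normalizer and logs the order of replayed events. */
    static List<String> trace(String xml) throws Exception {
        List<String> log = new ArrayList<>();
        ContentHandler recorder = new DefaultHandler() {
            @Override
            public void startPrefixMapping(String prefix, String uri) {
                log.add("ns:" + prefix);
            }
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                log.add("start:" + local);
            }
        };
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        reader.setContentHandler(new BufferingNamespaceNormalizer(recorder));
        reader.parse(new InputSource(new StringReader(xml)));
        return log;
    }
}
```

Nothing reaches the target until endDocument(), which is the buffering cost mentioned above. Note that this hoists declarations to the root; it does not by itself remove namespaces that are never used.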

We could add this as an optional feature of the XMLSerializer.

WDYT?

Sylvain





Re: Cleaning up unused namespace declaration

Posted by Stefano Mazzocchi <st...@apache.org>.
On 9 Feb 2004, at 03:16, Ugo Cei wrote:

> Stefano Mazzocchi wrote:
>> Yeah, well, that doesn't help me because I have the namespace 
>> declarations already there in the document I want to process and it 
>> appears that xsl:copy copies over the namespace declarations 
>> every time and it's not influenced by exclude-result-prefixes.
>
>   <!--
>      - Namespace removing templates
>      - from http://outerthought.net/wiki/Wiki.jsp?page=RemoveNamespaces

uh, wow, the wiki contains all sorts of things! gosh, too bad that it's 
like hidden over there :-/

>   -->
>   <xsl:template match="*" priority="-1" mode="copy">
>     <xsl:element name="{local-name()}">
>       <xsl:copy-of select="@*"/>
>       <xsl:apply-templates mode="copy"/>
>     </xsl:element>
>   </xsl:template>
>
>   <xsl:template match="text()" mode="copy">
>     <xsl:value-of select="."/>
>   </xsl:template>

yeah, I ended up doing something similar, thanks.

--
Stefano.


Re: Cleaning up unused namespace declaration

Posted by Ugo Cei <u....@cbim.it>.
Stefano Mazzocchi wrote:
> Yeah, well, that doesn't help me because I have the namespace 
> declarations already there in the document I want to process and it 
> appears that xsl:copy copies over the namespace declarations every time 
> and it's not influenced by exclude-result-prefixes.

   <!--
      - Namespace removing templates
      - from http://outerthought.net/wiki/Wiki.jsp?page=RemoveNamespaces
   -->
   <xsl:template match="*" priority="-1" mode="copy">
     <xsl:element name="{local-name()}">
       <xsl:copy-of select="@*"/>
       <xsl:apply-templates mode="copy"/>
     </xsl:element>
   </xsl:template>

   <xsl:template match="text()" mode="copy">
     <xsl:value-of select="."/>
   </xsl:template>
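
For anyone who wants to try these templates outside a pipeline, here is a small harness using the standard javax.xml.transform API. The class name RemoveNamespacesDemo is made up, and the root template that enters mode "copy" is an addition for this sketch, since the two templates above only fire once something applies that mode. Because the elements are rebuilt from local-name(), the XHTML namespace is dropped along with the unused ones.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class RemoveNamespacesDemo {
    // The two templates from the message, wrapped in a stylesheet. The root
    // template that switches to mode "copy" is an assumption of this sketch.
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/'>"
      + "<xsl:apply-templates mode='copy'/>"
      + "</xsl:template>"
      + "<xsl:template match='*' priority='-1' mode='copy'>"
      + "<xsl:element name='{local-name()}'>"
      + "<xsl:copy-of select='@*'/>"
      + "<xsl:apply-templates mode='copy'/>"
      + "</xsl:element>"
      + "</xsl:template>"
      + "<xsl:template match='text()' mode='copy'>"
      + "<xsl:value-of select='.'/>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the namespace-removing templates to the given document. */
    static String strip(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        transformer.transform(
            new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(strip(
            "<p xmlns='http://www.w3.org/1999/xhtml'>line<br "
          + "xmlns:dir='http://apache.org/cocoon/directory/2.0' "
          + "xmlns:include='http://apache.org/cocoon/include/1.0'/></p>"));
    }
}
```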

HTH,

		Ugo



RE: Cleaning up unused namespace declaration

Posted by Conal Tuohy <co...@paradise.net.nz>.
It's true that xsl:copy copies namespace declarations that are in scope. But
how do you have html elements inside the scope of a dir:directory element?
Are you using the XPathDirectoryGenerator? If so, or if you've transformed
the dir:file elements into inclusions, etc, then you might want to transform
the enclosing dir:file elements into xhtml at the same time. e.g. something
like:

<xsl:transform ... exclude-result-prefixes="dir">

	<xsl:template match="dir:file">
		<div class="file-content" id="{@name}">
			<xi:include href="{@name}"/>
		</div>
	</xsl:template>

	...

</xsl:transform>

Then when you come to xsl:copy the content in a later stage of the pipeline,
you won't have the dir namespace in scope any more.
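
A small check of the exclude-result-prefixes effect, run through the standard javax.xml.transform API rather than a Cocoon pipeline. The class name ExcludeResultPrefixesDemo and the sample input are made up for illustration, and the xi:include child is left out to keep the stylesheet self-contained. The same stylesheet is built with and without the attribute so the two outputs can be compared.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ExcludeResultPrefixesDemo {
    static final String DIR_NS = "http://apache.org/cocoon/directory/2.0";

    /** Builds the stylesheet, optionally with exclude-result-prefixes="dir". */
    static String stylesheet(boolean excludeDir) {
        return "<xsl:transform version='1.0'"
            + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
            + " xmlns:dir='" + DIR_NS + "'"
            + (excludeDir ? " exclude-result-prefixes='dir'" : "") + ">"
            + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
            + "<xsl:template match='dir:file'>"
            + "<div class='file-content' id='{@name}'/>"
            + "</xsl:template>"
            + "</xsl:transform>";
    }

    /** Transforms a directory listing into div elements. */
    static String transform(String xml, boolean excludeDir) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer(
            new StreamSource(new StringReader(stylesheet(excludeDir))));
        StringWriter out = new StringWriter();
        transformer.transform(
            new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String listing = "<dir:directory xmlns:dir='" + DIR_NS + "' name='blog'>"
            + "<dir:file name='entry.xml'/></dir:directory>";
        System.out.println("excluded: " + transform(listing, true));
        System.out.println("default:  " + transform(listing, false));
    }
}
```

Without the attribute, the literal div carries the dir declaration from the stylesheet; with it, the declaration is left off because nothing in the result uses that namespace.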

Con


> -----Original Message-----
> From: Stefano Mazzocchi [mailto:stefano@apache.org]
> Sent: Monday, 9 February 2004 16:18
> To: dev@cocoon.apache.org
> Subject: Re: Cleaning up unused namespace declaration
>
>
>
> On 8 Feb 2004, at 21:37, Conal Tuohy wrote:
>
> > I don't know if you can configure the xml serializer to drop a
> > namespace (seems unlikely, because such namespace might not be used
> > until the end of the document, for all the serializer knows, so it
> > wouldn't be safe without buffering the entire output document to
> > check).
> >
> > But typically you should suppress the namespace in the XSLT which
> > converts the dir:* content into xhtml, using the
> > exclude-result-prefixes attribute of the xsl:stylesheet or
> > xsl:transform element. This works for me - I just checked! :-)
>
> Yeah, well, that doesn't help me because I have the namespace
> declarations already there in the document I want to process and it
> appears that xsl:copy copies over the namespace declarations every
> time and it's not influenced by exclude-result-prefixes.
>
> And this isn't true if you use <xsl:element name="{name()}">, which
> feels hacky, but what the hell.
>
> [read http://www.xslt.com/xsl-list/2002-02/msg00026.html for
> more info]
>
> --
> Stefano.
>
>


Re: Cleaning up unused namespace declaration

Posted by Stefano Mazzocchi <st...@apache.org>.
On 8 Feb 2004, at 21:37, Conal Tuohy wrote:

> I don't know if you can configure the xml serializer to drop a 
> namespace (seems unlikely, because such namespace might not be used 
> until the end of the document, for all the serializer knows, so it 
> wouldn't be safe without buffering the entire output document to 
> check).
>
> But typically you should suppress the namespace in the XSLT which 
> converts the dir:* content into xhtml, using the 
> exclude-result-prefixes attribute of the xsl:stylesheet or 
> xsl:transform element. This works for me - I just checked! :-)

Yeah, well, that doesn't help me because I have the namespace 
declarations already there in the document I want to process and it 
appears that xsl:copy copies over the namespace declarations every time 
and it's not influenced by exclude-result-prefixes.

And this isn't true if you use <xsl:element name="{name()}">, which 
feels hacky, but what the hell.

[read http://www.xslt.com/xsl-list/2002-02/msg00026.html for more info]

--
Stefano.


RE: Cleaning up unused namespace declaration

Posted by Conal Tuohy <co...@paradise.net.nz>.
I don't know if you can configure the xml serializer to drop a namespace
(seems unlikely, because such namespace might not be used until the end of
the document, for all the serializer knows, so it wouldn't be safe without
buffering the entire output document to check).

But typically you should suppress the namespace in the XSLT which converts
the dir:* content into xhtml, using the exclude-result-prefixes attribute of
the xsl:stylesheet or xsl:transform element. This works for me - I just
checked! :-)

Con

> -----Original Message-----
> From: Stefano Mazzocchi [mailto:stefano@apache.org]
> Sent: Monday, 9 February 2004 15:18
> To: Cocoon
> Subject: Cleaning up unused namespace declaration
>
>
> I'm not normally bugged by namespace declarations which aren't used,
> but boy, something like this just can't go on without me doing
> something about it:
>
> <br xmlns:dir="http://apache.org/cocoon/directory/2.0"
> xmlns:include="http://apache.org/cocoon/include/1.0"/>
>
> [taken from my blog output]
>
> do you have any suggestions on how to use the xml serializer so that
> it stops doing that?
>
> --
> Stefano.
>
>