Posted to dev@cocoon.apache.org by Vadim Gritsenko <va...@verizon.net> on 2003/08/16 01:59:29 UTC

Re: HTMLGenerator

Joerg Heinicke wrote:

> Specifying the XPath exactly 
> (/html[1]/body[1]/center[1]/table[4]/tr[1]/td[1]) works. Should be a 
> good hint for the Xalan team, shouldn't it? 


I've added the following to XPathTestCase.java in avalon-excalibur/xmlutil:

            // 9. Test
            expr = "/test:root/test:element2/test:table[1]/test:tr/test:td[4]";
            node = processor.selectSingleNode(document1, expr);
            assertNotNull("Must select <test:td/> node, but got null", node);
            assertEquals("Must select <test:td/> node", Node.ELEMENT_NODE, node.getNodeType());
            assertEquals("Must select <test:td/> node", "td", node.getLocalName());

And:

    static final String CONTENT1 =
        "<?xml version=\"1.0\"?>" +
        "<test:root xmlns:test=\"http://localhost/test\">" +
            "<test:element1/>" +
            "<test:element2>" +
              "<test:table>" +
              "  <test:tr>" +
              "    <test:td>table1 tr1 td1</test:td>" +
              "    <test:td>table1 tr1 td2</test:td>" +
              "    <test:td>table1 tr1 td3</test:td>" +
              "  </test:tr>" +
              "  <test:tr>" +
              "    <test:td>table1 tr2 td1</test:td>" +
              "    <test:td>table1 tr2 td2</test:td>" +
              "    <test:td>table1 tr2 td3</test:td>" +
              "    <test:td>table1 tr2 td4</test:td>" +
              "  </test:tr>" +
              "  <test:tr>" +
              "    <test:td>table1 tr3 td1</test:td>" +
              "    <test:td>table1 tr3 td2</test:td>" +
              "    <test:td>table1 tr3 td3</test:td>" +
              "  </test:tr>" +
              "</test:table>" +
              "<test:table>" +
              "  <test:tr>" +
              "    <test:td>table2 tr1 td1</test:td>" +
              "    <test:td>table2 tr1 td2</test:td>" +
              "    <test:td>table2 tr1 td3</test:td>" +
              "  </test:tr>" +
              "  <test:tr>" +
              "    <test:td>table2 tr2 td1</test:td>" +
              "    <test:td>table2 tr2 td2</test:td>" +
              "    <test:td>table2 tr2 td3</test:td>" +
              "  </test:tr>" +
              "  <test:tr>" +
              "    <test:td>table2 tr3 td1</test:td>" +
              "    <test:td>table2 tr3 td2</test:td>" +
              "    <test:td>table2 tr3 td3</test:td>" +
              "  </test:tr>" +
              "</test:table>" +
            "</test:element2>" +
        "</test:root>";

Still works :-/
Can you reproduce this using excalibur's test case ("ant test" in 
avalon-excalibur/xmlutil)?
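
For anyone without the excalibur build at hand, here is a rough standalone sketch of the same check. It is only an approximation: it uses the JDK's javax.xml.xpath API instead of excalibur's XPathProcessor, and the class name, the select() helper and the cut-down document are made up for illustration.

```java
import java.io.StringReader;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical standalone check; not part of avalon-excalibur.
public class XPathTd4Sketch {
    static final String TEST_NS = "http://localhost/test";

    // Cut-down variant of CONTENT1: only the second row has a fourth cell.
    static final String CONTENT =
        "<?xml version=\"1.0\"?>" +
        "<test:root xmlns:test=\"http://localhost/test\">" +
          "<test:element2>" +
            "<test:table>" +
              "<test:tr>" +
                "<test:td>table1 tr1 td1</test:td>" +
              "</test:tr>" +
              "<test:tr>" +
                "<test:td>table1 tr2 td1</test:td>" +
                "<test:td>table1 tr2 td2</test:td>" +
                "<test:td>table1 tr2 td3</test:td>" +
                "<test:td>table1 tr2 td4</test:td>" +
              "</test:tr>" +
            "</test:table>" +
          "</test:element2>" +
        "</test:root>";

    // Parses the XML namespace-aware and returns the first node matching expr, or null.
    static Node select(String xml, String expr) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                              .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            // Minimal prefix mapping: only the "test" prefix is known.
            xpath.setNamespaceContext(new NamespaceContext() {
                public String getNamespaceURI(String prefix) {
                    return "test".equals(prefix) ? TEST_NS : XMLConstants.NULL_NS_URI;
                }
                public String getPrefix(String uri) { return null; }
                public Iterator<String> getPrefixes(String uri) { return null; }
            });
            return (Node) xpath.evaluate(expr, doc, XPathConstants.NODE);
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed: " + expr, e);
        }
    }

    public static void main(String[] args) {
        Node node = select(CONTENT,
            "/test:root/test:element2/test:table[1]/test:tr/test:td[4]");
        System.out.println(node.getLocalName() + ": " + node.getTextContent());
    }
}
```

Since this goes through JAXP rather than through XPathAPI.eval() and nodelist(), a passing run here says nothing definite about the Xalan code path in XPathProcessorImpl; it only shows that the expression itself is fine.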

Vadim



> Joerg
>
>
> Joerg Heinicke wrote:
>
>> Vadim Gritsenko wrote:
>>
>>>> The Yahoo screenscrape example has not been working for a long time 
>>>> now. I have not had a chance to find out why. Search the sample 
>>>> sitemaps; I had included a Yahoo screenscrape pipeline 
>>>> somewhere..... Oops, it's gone.
>>>
>>>
>>>
>>> Found it in blocks/html/samples/sitemap.xmap. Does not work:
>>
>>
>>
>> That happens only if used with the xpath parameter. I debugged a bit 
>> as far as possible (is it possible to tell Eclipse not to use the 
>> Xalan sources out of JDK's src.zip, but from the path I'm telling 
>> it?) and I found out:
>> org.apache.excalibur.xml.xpath.XPathProcessorImpl:265
>>
>> final XObject result =
>>          XPathAPI.eval(contextNode, str, new XalanResolver(resolver));
>>
>> works, but the next line
>>
>> result.nodelist()
>>
>> (converting the XNodeSet into a nodelist) results in the 
>> ArrayIndexOutOfBoundsException.
>>
>> Found no related bug in Bugzilla.
>>
>> Tried it with a Xalan jar built from CVS and it did not work either.
>>
>> Joerg
>



Re: HTMLGenerator

Posted by Geoff Howard <co...@leverageweb.com>.
Steven Noels wrote:
> Vadim Gritsenko wrote:
> 
>> Switch to google *immediately*! Say, http://directory.google.com/ -- 
>> that's what we were pulling from yahoo
> 
> 
> What about http://news.google.com/news/en/us/technology.html
> 
> Should be less static.
> 
> </Steven>

That's my favorite one so far.

Geoff


Re: HTMLGenerator

Posted by Steven Noels <st...@outerthought.org>.
Vadim Gritsenko wrote:

> Switch to google *immediately*! Say, http://directory.google.com/ -- 
> that's what we were pulling from yahoo

What about http://news.google.com/news/en/us/technology.html

Should be less static.

</Steven>
-- 
Steven Noels                            http://outerthought.org/
Outerthought - Open Source, Java & XML Competence Support Center
Read my weblog at            http://blogs.cocoondev.org/stevenn/
stevenn at outerthought.org                stevenn at apache.org


Re: HTMLGenerator

Posted by Vadim Gritsenko <va...@verizon.net>.
Geoff Howard wrote:

> Joerg Heinicke wrote:
>
> ...
>
>> GOT IT!!!
>>
>> Whatever node type this is supposed to be: <![if !IE]> <![endif]>
>> This is the reason for the exception. Why does the explicit XPath 
>> work? I don't know ... At other places in the same file these 
>> constructs are placed in comments and cause no trouble: 
>> <!--[if IE]><![endif]--> To which language do these constructs belong?
>
>
> You mean you don't use the "Downlevel-revealed Conditional Comment" 
> and the "Downlevel-hidden Conditional Comment" every day?  When I 
> first ran into this I was quite confused but Google came to the rescue:
> http://msdn.microsoft.com/workshop/author/dhtml/overview/ccomment_ovw.asp
>
> Basically, IIRC it's a Microsoft extension to HTML: a conditional 
> evaluated by the browser.  The two have different meanings that I 
> haven't quite taken the time to grasp.  Do they use it on every 
> page?  How about switching to Google or CNN? 


EEEEEEEEEEWWWWWWWW!!!!!!!!!

Switch to google *immediately*! Say, http://directory.google.com/ -- 
that's what we were pulling from yahoo


Vadim



Re: HTMLGenerator

Posted by Joerg Heinicke <jo...@gmx.de>.
Geoff Howard wrote:

>> GOT IT!!!
>>
>> Whatever node type this is supposed to be: <![if !IE]> <![endif]>
>> This is the reason for the exception. Why does the explicit XPath 
>> work? I don't know ... At other places in the same file these 
>> constructs are placed in comments and cause no trouble: 
>> <!--[if IE]><![endif]--> To which language do these constructs belong?
> 
> 
> You mean you don't use the "Downlevel-revealed Conditional Comment" and 
> the "Downlevel-hidden Conditional Comment" every day?  When I first ran 
> into this I was quite confused but Google came to the rescue:
> http://msdn.microsoft.com/workshop/author/dhtml/overview/ccomment_ovw.asp

No, never - and I'm sure I never will!! If I need browser-specific code, 
I will use Cocoon with the browser selector and XSLT :)

> Basically, IIRC it's a Microsoft extension to HTML: a conditional 
> evaluated by the browser.  The two have different meanings that I 
> haven't quite taken the time to grasp.  Do they use it on every 
> page?  How about switching to Google or CNN?

I don't know whether it is on every page, but they *do* use it, and this 
can always break our sample. I already have a simple list for 
http://www.theserverside.com running. (Top news is Cocoon 2.1 at the 
moment :) If there is any interest, I can style it tomorrow. Of course 
we can switch to any news site; maybe others have better suggestions? 
Apache sites are too boring for this purpose, aren't they?

Joerg


Re: HTMLGenerator

Posted by Geoff Howard <co...@leverageweb.com>.
Joerg Heinicke wrote:

...

> GOT IT!!!
> 
> Whatever node type this is supposed to be: <![if !IE]> <![endif]>
> This is the reason for the exception. Why does the explicit XPath 
> work? I don't know ... At other places in the same file these 
> constructs are placed in comments and cause no trouble: 
> <!--[if IE]><![endif]--> To which language do these constructs belong?

You mean you don't use the "Downlevel-revealed Conditional Comment" and 
the "Downlevel-hidden Conditional Comment" every day?  When I first ran 
into this I was quite confused but Google came to the rescue:
http://msdn.microsoft.com/workshop/author/dhtml/overview/ccomment_ovw.asp

Basically, IIRC it's a Microsoft extension to HTML: a conditional 
evaluated by the browser.  The two have different meanings that I 
haven't quite taken the time to grasp.  Do they use it on every 
page?  How about switching to Google or CNN?

Geoff


Re: HTMLGenerator

Posted by Joerg Heinicke <jo...@gmx.de>.
Vadim Gritsenko wrote:

> Joerg Heinicke wrote:
> 
>> Specifying the XPath exactly 
>> (/html[1]/body[1]/center[1]/table[4]/tr[1]/td[1]) works. Should be a 
>> good hint for the Xalan team, shouldn't it? 
> 
> 
> 
> I'd added into XPathTestCase.java in avalon-excalibur/xmlutil:

...

> Still works :-/
> Can you reproduce this using excalibur's test case ("ant test" in 
> avalon-excalibur/xmlutil)?

I can confirm this.

Even worse:
1. I used HTMLGenerator without the xpath parameter, serialized the 
document to XML, removed the default namespace (XHTML), saved the file 
to disk, changed the test case to read this file and used the original 
XPath: works too.

2.
<map:generate type="html" src="http://cocoon.apache.org/">
   <map:parameter name="xpath"
                  value="/html/body/table[3]/tr/td[2]/table/tr[4]/td/div/div/p[3]"/>
</map:generate>
<map:serialize type="html"/>

results in

<p xmlns="http://www.w3.org/1999/xhtml">Cocoon is "web glue for your web 
application development needs". It is a glue that keeps concerns 
separate and allows parallel evolution of all aspects of a web 
application, improving development pace and reducing the chance of 
conflicts.</p>

=> works.

--------------------------

GOT IT!!!

Whatever node type this is supposed to be: <![if !IE]> <![endif]>
This is the reason for the exception. Why does the explicit XPath work? I 
don't know ... At other places in the same file these constructs are 
placed in comments and cause no trouble: <!--[if IE]><![endif]--> To which 
language do these constructs belong?
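
If we wanted to keep scraping a page that contains these markers, one conceivable workaround would be to strip them from the raw HTML before it reaches the parser. The following is only a sketch under that assumption: the class name and the regular expression are invented for illustration, and nothing like this exists in HTMLGenerator as far as this thread shows. It removes only the bare <![if ...]> and <![endif]> markers and keeps the content between them; the variant wrapped in <!-- --> is a normal comment and is left untouched.

```java
import java.util.regex.Pattern;

// Hypothetical pre-filter; not part of Cocoon or HTMLGenerator.
public class ConditionalCommentStripper {

    // Matches bare markers such as <![if !IE]> and <![endif]>.
    // <!--[if IE]> does not match because it starts with "<!--", and
    // <![endif]--> does not match because "]" is followed by "--", not ">".
    private static final Pattern REVEALED = Pattern.compile(
        "<!\\[\\s*(?:if\\b[^\\]]*|endif)\\s*\\]>", Pattern.CASE_INSENSITIVE);

    // Returns the input with the bare markers removed and everything else kept.
    static String strip(String html) {
        return REVEALED.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String html = "<td><![if !IE]><b>x</b><![endif]></td>"
                    + "<!--[if IE]><i>y</i><![endif]-->";
        System.out.println(strip(html));
    }
}
```

Whether removing the markers is enough to avoid the ArrayIndexOutOfBoundsException has not been verified here; it would have to be tried against the actual yahoo.com page.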

What should we do? Drop yahoo.com and pick another site (e.g. one more 
related to computers, software development, XML and Java)? Or use the 
explicit XPath with yahoo.com?

Joerg