Posted to j-users@xalan.apache.org by Ittay Dror <it...@qlusters.com> on 2006/04/08 15:54:52 UTC

trying to use id() function in xpath

hi,

i'm new to xalan and xsl. i'm trying to get elements from an html document using the xpath id() function. i'm not using an xsl stylesheet, just trying to select elements.

this is my code (basically a slightly modified ApplyXPathJaxp sample):
        InputSource xml = new InputSource("/tmp/test.html");
        xml.setEncoding("US-ASCII");

        String expr = "id('foo')";

        // Create a new XPath
        XPathFactory factory = XPathFactory.newInstance();
        XPath xpath = factory.newXPath();
        try {
          // compile the XPath expression
          XPathExpression xpathExpr = xpath.compile(expr);

          // Evaluate the XPath expression against the input document
          Node node = (Node) xpathExpr.evaluate(xml, XPathConstants.NODE);
          System.out.println(node);
        }
        catch (Exception e) {
          e.printStackTrace();
        }

and my html is:

<html>
<body>
        <label id="foo">hello</label>
</body>
</html>

i've tried the following:
1. adding a doctype <!DOCTYPE html SYSTEM "/tmp/xhtml1-transitional.dtd"> (the dtd and ent files are saved locally). this is not a good option for me, since i want to parse arbitrary html documents (i'll do that with tagsoup and SAX2DOM, but first i want to get this to work)
2. creating a DOM, setting schema-location and schema-type in its config, and using XPathAPI or doc.getElementById
3. setting the system id in the InputSource

from debugging, it seems that the parser doesn't recognize the 'id' attribute as being of type ID.

i would really like the id() function to work (the xpath expressions are used by users, and id() is more natural than defining keys, though i don't know how to do that either)
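
For documents that carry no DTD, a predicate on the attribute value does not depend on the attribute being typed as ID, so it can serve as a fallback for id(). A minimal sketch, assuming the markup is already well-formed; the class name PredicateLookup and the helper textOfId are hypothetical and not part of the code above:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class PredicateLookup {

    // Returns the text content of the first element whose "id" attribute
    // equals the given value, or null when nothing matches.
    // Assumption: the id value contains no single quote, since it is
    // spliced into the expression as a string literal.
    static String textOfId(String markup, String id) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = db.parse(new InputSource(new StringReader(markup)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        String expr = "//*[@id='" + id + "']";
        Node node = (Node) xpath.evaluate(expr, doc, XPathConstants.NODE);
        return node == null ? null : node.getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Same sample document as above, with no DOCTYPE at all.
        String html = "<html><body><label id=\"foo\">hello</label></body></html>";
        System.out.println(textOfId(html, "foo"));
    }
}
```

The predicate compares plain attribute values, so it works the same with or without a DTD. The trade-off is that it is not the id() syntax users may expect.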

thanks for your help,
ittay

-- 
===================================
Ittay Dror 
openQRM Team Leader, 
R&D, Qlusters Inc.
ittayd@qlusters.com
+972-3-6081994 Fax: +972-3-6081841

http://www.openQRM.org
- Keeps your Data-Center Up and Running

Re: trying to use id() function in xpath

Posted by Joanne Tong <jo...@ca.ibm.com>.
Hi

I tried the following and it works:

        try {
                DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
                DocumentBuilder db = dbf.newDocumentBuilder();
                Document doc = db.parse("test.xml");
                XPath xpath = XPathFactory.newInstance().newXPath();
                String expression = "id('foo')";
                Node node = (Node) xpath.evaluate(expression, doc, XPathConstants.NODE);
                String tmp1 = node.getFirstChild().getNodeValue();
                System.out.println("tmp1: " + tmp1);
        } catch (Exception e) {
                e.printStackTrace();
        }

input is 

<!DOCTYPE html 
     PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<body>
        <label id="foo">hello</label>
</body>
</html>
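
The DOCTYPE matters here because XPath's id() only matches attributes that a DTD declares with type ID. As a sketch of the same idea without fetching the XHTML DTD, the declaration can live in an internal subset. The ATTLIST below covers only the label element and is an assumption made for this example; the class name InternalSubsetId and the helper textOfId are hypothetical:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class InternalSubsetId {

    // Evaluates id('<id>') against the parsed document and returns the
    // text content of the match, or null when id() selects nothing.
    static String textOfId(String xml, String id) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = db.parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        Node node = (Node) xpath.evaluate("id('" + id + "')", doc, XPathConstants.NODE);
        return node == null ? null : node.getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // The internal subset declares label/@id as an ID attribute.
        String xml = "<!DOCTYPE html [\n"
            + "  <!ATTLIST label id ID #IMPLIED>\n"
            + "]>\n"
            + "<html><body><label id=\"foo\">hello</label></body></html>";
        System.out.println(textOfId(xml, "foo"));
    }
}
```

The parser stays non-validating; it only reads the attribute declaration, so no element declarations are needed for this to run.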

thanks,

Joanne Tong
Software Developer, XSLT Development



Ittay Dror <it...@qlusters.com> wrote on 04/09/2006 05:39 AM
(To: Ittay Dror <it...@qlusters.com>, cc: xalan-j-users@xml.apache.org,
Subject: Re: trying to use id() function in xpath):

also tried:
public static class HTMLDocumentBuilderFactory extends DocumentBuilderFactoryImpl {
        public HTMLDocumentBuilderFactory() throws SAXException {
            SchemaFactory sfac =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);

            String schemaString = ""
                + "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
                + "    <xs:attribute name=\"id\" type=\"xs:ID\"/>"
                + "</xs:schema>";

            Schema schema = sfac.newSchema(new StreamSource(
                                new StringReader(schemaString)));

            setSchema(schema);
            setValidating(true);
        }
}

and in main:

System.setProperty("javax.xml.parsers.DocumentBuilderFactory",
        HTMLDocumentBuilderFactory.class.getName());

Re: trying to use id() function in xpath

Posted by Ittay Dror <it...@qlusters.com>.
also tried:
public static class HTMLDocumentBuilderFactory extends DocumentBuilderFactoryImpl {
        public HTMLDocumentBuilderFactory() throws SAXException {
            SchemaFactory sfac =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);

            String schemaString = ""
                + "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
                + "    <xs:attribute name=\"id\" type=\"xs:ID\"/>"
                + "</xs:schema>";

            Schema schema = sfac.newSchema(new StreamSource(
                                new StringReader(schemaString)));

            setSchema(schema);
            setValidating(true);
        }
}

and in main:

System.setProperty("javax.xml.parsers.DocumentBuilderFactory", HTMLDocumentBuilderFactory.class.getName());


Re: trying to use id() function in xpath

Posted by Ittay Dror <it...@qlusters.com>.
i've also tried this:
  Parser p = new Parser(); // from tagsoup

  SAX2DOM sax2dom = new SAX2DOM();
  Document doc = (Document)sax2dom.getDOM();
  DOMConfiguration config = doc.getDomConfig();
  config.setParameter("schema-type","http://www.w3.org/TR/REC-xml");
  config.setParameter("schema-location", "/tmp/xhtml1-transitional.dtd");
  // config.setParameter("datatype-normalization", Boolean.FALSE);
  //config.setParameter("psvi", Boolean.TRUE);
  config.setParameter("validate",Boolean.TRUE);
  doc.insertBefore(doc.getImplementation().createDocumentType("html", null, "/tmp/xhtml1-transitional.dtd"), null); 
  p.setContentHandler(sax2dom);
  
  
  InputSource docsrc = new InputSource("/tmp/test.html");
  docsrc.setSystemId("/tmp/xhtml1-transitional.dtd");
  p.parse(docsrc);
  
  System.out.println(doc.getElementById("foo"));
  
i get null in the console
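
getElementById() and id() both rely on attributes that are typed as ID, and a DOM built without a DTD has none. One possible workaround for arbitrary HTML is to flag every id attribute after the tree is built, using DOM Level 3 Element.setIdAttribute. A minimal sketch, assuming a DOM Level 3 implementation such as the JDK's default parser; a plain DocumentBuilder stands in for the tagsoup/SAX2DOM pipeline, and the class name IdMarker and the methods parse and markIds are hypothetical:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class IdMarker {

    // Stand-in for the tagsoup/SAX2DOM step: builds a DOM from a string.
    static Document parse(String markup) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return db.parse(new InputSource(new StringReader(markup)));
    }

    // Flags every "id" attribute in the document as an ID-typed attribute,
    // so that Document.getElementById can find the owning element.
    static void markIds(Document doc) {
        NodeList all = doc.getElementsByTagName("*");
        for (int i = 0; i < all.getLength(); i++) {
            Element el = (Element) all.item(i);
            if (el.hasAttribute("id")) {
                el.setIdAttribute("id", true);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<html><body><label id=\"foo\">hello</label></body></html>");

        // No DTD, so nothing is typed as ID yet and the lookup fails.
        System.out.println(doc.getElementById("foo"));

        markIds(doc);
        Element label = doc.getElementById("foo");
        System.out.println(label.getTextContent());
    }
}
```

An XPath engine that resolves id() through Document.getElementById should then see the same elements, but that depends on the implementation and is worth checking against the one in use.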

thanx,
ittay
