Posted to j-dev@xerces.apache.org by Ryan Schutt <rs...@vt.edu> on 2000/04/14 18:50:24 UTC

escaped characters

Escaped characters are causing problems when using the Xerces SAX parser.
For example,

<property>Econ &amp; Fiber Sys</property>

When the characters() method is called, I'm only getting the characters "
Fiber Sys".  Why is this happening?

-Ryan



Re: Location DTDs

Posted by Andy Clark <an...@apache.org>.
Theofanis Vassiliou-Gioles wrote:
> I have a problem in locating DTDs within my application.
> [...]
> Any ideas how to tell Xerces where to search?

Register an org.xml.sax.EntityResolver instance to map the system
URI in the file to a local copy. Here's a very simple situation:

  EntityResolver resolver = new EntityResolver() {
    public InputSource resolveEntity(String publicId, String systemId)
      throws SAXException, IOException {

      // resolve my dtd!
      if (systemId != null &&
          systemId.equals("http://www.foobar.com/mydtd.dtd")) {
        return new InputSource("file:/user/local/myapp/dtds/mydtd.dtd");
      }

      // let parser handle resolution
      return null;
    }
  };
  parser.setEntityResolver(resolver);
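
For the CLASSPATH part of the question, the same idea can hand the parser
a stream instead of a file URI. The following is only a minimal sketch
using the JDK's bundled JAXP parser. The class name LocalDtdDemo, the
one-line DTD text, and the resource path mentioned in the comment are all
assumptions made for illustration, not anything from the original
application:

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class LocalDtdDemo {

    // Hypothetical system ID and DTD text, for illustration only.
    static final String SYSTEM_ID = "http://www.foobar.com/mydtd.dtd";
    static final String DTD_TEXT = "<!ELEMENT mytype (#PCDATA)>";

    /** Parses a small document; returns true if the resolver supplied the DTD. */
    static boolean parseWithLocalDtd() throws Exception {
        final boolean[] resolved = { false };

        EntityResolver resolver = new EntityResolver() {
            public InputSource resolveEntity(String publicId, String systemId)
                    throws SAXException, IOException {
                if (SYSTEM_ID.equals(systemId)) {
                    resolved[0] = true;
                    // A real application might instead open a classpath resource,
                    // e.g. getClass().getResourceAsStream("/dtds/mydtd.dtd")
                    // (hypothetical path), and wrap it in an InputSource.
                    InputSource local = new InputSource(new StringReader(DTD_TEXT));
                    local.setSystemId(systemId);
                    return local;
                }
                // Let the parser handle every other entity itself.
                return null;
            }
        };

        String xml = "<!DOCTYPE mytype SYSTEM \"" + SYSTEM_ID + "\">"
                   + "<mytype>hello</mytype>";

        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        reader.setEntityResolver(resolver);
        reader.setContentHandler(new DefaultHandler());
        reader.setErrorHandler(new DefaultHandler());
        reader.parse(new InputSource(new StringReader(xml)));
        return resolved[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println("DTD resolved locally: " + parseWithLocalDtd());
    }
}
```

Because the resolver returns a character stream, the parser never has to
open the http: URL, so the application does not need to be online.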

Just make sure that you're using the appropriate file URI syntax.
When I use the file: protocol, I have to use several slashes, e.g.:

  file:///c:/data/personal.xml
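
One way to avoid writing the file URI by hand is to let the JDK build it
from a path. This is just a sketch; the class name FileUriDemo and the
relative path dtds/mydtd.dtd are hypothetical, and the exact number of
slashes in the result depends on the platform and on how the URI is
built:

```java
import java.io.File;
import java.net.URI;

public class FileUriDemo {

    /** Converts a local path into a file: URI string usable as a system ID. */
    static String toFileUri(String path) {
        // getAbsoluteFile() resolves the path against the working directory;
        // toURI() then applies the file: scheme and escapes special characters.
        URI uri = new File(path).getAbsoluteFile().toURI();
        return uri.toString();
    }

    public static void main(String[] args) {
        // Hypothetical relative path, for illustration only.
        System.out.println(toFileUri("dtds/mydtd.dtd"));
    }
}
```

The resulting string can be passed to the InputSource constructor in the
resolver.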

-- 
Andy Clark * IBM, JTC - Silicon Valley * andyc@apache.org

Location DTDs

Posted by Theofanis Vassiliou-Gioles <va...@fokus.gmd.de>.
Hello,

I have a problem in locating DTDs within my application.

Let me explain:
If an XML file has a DOCTYPE declaration like

	<!DOCTYPE mytype SYSTEM "mydtd.dtd" >

I would like to write a validating application that has the DTD at a
predefined location, e.g somewhere in the CLASSPATH, and not in the
directory where the XML-file resides.

I think a solution like 
	
	<!DOCTYPE mytype SYSTEM "http://www.foobar.com/mydtd.dtd" >

is not adequate, as it requires being online when using the application.
Also, the solution

	<!DOCTYPE mytype SYSTEM "file:/user/local/myapp/dtds/mydtd.dtd" >

is not an ideal solution, because if the application resides somewhere
else on another system, for example under c:\programs\myapp, this would
lead to a completely different SYSTEM ID.

Any ideas how to tell Xerces where to search?

Best regards,

Theo

__________________________
Theofanis Vassiliou-Gioles      email: vassiliou@fokus.gmd.de
GMD FOKUS                       phone: +49 (30) 3463 - 7346
Kaiserin-Augusta-Allee 31       room : 3008
D-10589 Berlin                  http://www.fokus.gmd.de/usr/vassiliou/


Re: escaped characters

Posted by Brett McLaughlin <br...@earthlink.net>.

Ryan Schutt wrote:
> 
> Escaped characters are causing problems when using the Xerces SAX parser.
> For example,
> 
> <property>Econ &amp; Fiber Sys</property>
> 
> When the characters() method is called, I'm only getting the characters "
> Fiber Sys".  Why is this happening?

You are making an incorrect assumption about what the characters()
callback does.  There is no guarantee that any single invocation of
characters() contains all of the textual data for an element.  So the
following element:

<element>Some textual data</element>

will have at least:

1 startElement()
1 characters()
1 endElement()

However, it could have as many characters() invocations as the parser
wants to fire off:

// Each ch[] input is shown here as its string value
characters("Som")
characters("e text")
characters("ual d")
characters("ata")

In addition, it is almost always faster for a parser implementation to
fire off a separate event for an entity reference, like &amp;.  What is
almost certainly happening in your particular example is:

characters("Econ ")
characters("&")
characters(" Fiber Sys")

That is the case with Oracle V2, XML4J, Xerces, and Project X as of
today.

In the characters callback you should always do:

public void characters(char[] ch, int start, int length) {
  // The third argument is a length, not an end index
  currentData.append(ch, start, length);
}

where currentData is a StringBuffer of running data for your element.
Otherwise each invocation overwrites the data from the previous ones.

Try this:

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class MyHandler extends DefaultHandler {

    // Running text of the element currently being read
    StringBuffer elementData;
    String elementName;

    public MyHandler() {
        elementData = new StringBuffer();
    }

    public void startElement(String namespaceURI, String localName,
                             String rawName, Attributes atts)
        throws SAXException {

        elementName = localName;
        // Start from an empty buffer for each new element
        elementData.setLength(0);
    }

    // The third argument is the number of characters, not an end index
    public void characters(char[] ch, int start, int length)
        throws SAXException {

        // Append; this may be called several times per element
        elementData.append(ch, start, length);
    }

    public void endElement(String namespaceURI, String localName,
                           String rawName) throws SAXException {

        // You have all your data - use it
        System.out.println("Element: " + elementName + " has data " +
                           elementData.toString());
        elementData.setLength(0);
    }
}

That will work with the handler registered through setContentHandler().
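
To check the accumulation idea against the element from the original
question, here is a small self-contained sketch that uses the JDK's
bundled SAX parser rather than a specific Xerces release. The class and
method names (CharactersDemo, CollectingHandler, parseProperty) are made
up for illustration. How many times characters() fires is up to the
parser, so the sketch only prints that count and relies on the buffer for
the result:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class CharactersDemo {

    /** Collects the complete text of each element, however it was chunked. */
    static class CollectingHandler extends DefaultHandler {
        final StringBuilder buffer = new StringBuilder();
        final List<String> texts = new ArrayList<>();
        int calls = 0;

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            // Begin each element with an empty buffer.
            buffer.setLength(0);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // Append rather than assign: earlier chunks must be kept.
            calls++;
            buffer.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            // Only here is the element's text known to be complete.
            texts.add(buffer.toString());
        }
    }

    /** Parses the element from the question and returns its full text. */
    static String parseProperty() throws Exception {
        String xml = "<property>Econ &amp; Fiber Sys</property>";
        CollectingHandler handler = new CollectingHandler();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        System.out.println("characters() calls: " + handler.calls);
        return handler.texts.get(0);
    }

    public static void main(String[] args) throws Exception {
        System.out.println("Text: " + parseProperty());
    }
}
```

Whatever chunking the parser chooses, the text added in endElement() is
the whole "Econ & Fiber Sys" string, with the entity already expanded.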

-Brett

> 
> -Ryan