Posted to j-users@xerces.apache.org by Eric <ej...@ir.iit.edu> on 2001/05/16 06:30:37 UTC

problems formatting output

I've been having some troubling problems using OutputFormat to write
out a DOM.  I'm basically parsing some ugly XML (spaces before '>'s,
no newlines, etc.) into a DOM and then writing it back
out.  I've traversed the DOM and it looks like it gets parsed just
fine, but when I use OutputFormat to write it out, the output still
has ugliness (although different from the input).  Tags are
word-wrapped so that they split over lines, etc.  It looks like this:

                <time>15:04:27</time>   <anten 
                type="broadcast">    <Station num="17650">

What am I doing wrong???

Thanks,
Eric.

-- 
 _____  _ 
| ____|(_)     http://ir.iit.edu/~ej
|  _|  | |     Page me via ICQ at
| |___ | |     http://wwp.mirabilis.com/19022931
|______/ |     or by mailing 19022931@pager.mirabilis.com
     |__/

Re: problems formatting output - Mine works why doesn't yours?

Posted by Eric <ej...@ir.iit.edu>.
Nope, I tried it with your parsing set up and I get the same
thing...  I'm using the latest Xerces.

thanx anyways,
eric.

On Thu, May 17, 2001 at 12:42:52PM +1000, Anthony Ikeda wrote:
> Mine is the same, the code output has return codes throughout it (we have
> some other pre-XML classes that do half of the job).
> 
> The code below actually reformats the whole file. What version of Xerces are
> you using? I've tried it with the latest version (1.3.1) and Xalan 2.0.1.
> 
> It's possible your parsing is not set up properly, give this a go:
> [...]


Re: problems formatting output - Mine works why doesn't yours?

Posted by Anthony Ikeda <an...@proxima-tech.com.au>.
Mine is the same, the code output has return codes throughout it (we have
some other pre-XML classes that do half of the job).

The code below actually reformats the whole file. What version of Xerces are
you using? I've tried it with the latest version (1.3.1) and Xalan 2.0.1.

It's possible your parsing is not set up properly. Give this a go:

  import java.io.IOException;
  import org.apache.xerces.parsers.DOMParser;
  import org.w3c.dom.Document;
  import org.xml.sax.SAXException;

  public static Document parseDoc(String uri){
    DOMParser parser = new DOMParser();
    try{
      // Leave ignorable whitespace out of the DOM.
      parser.setFeature("http://apache.org/xml/features/dom/include-ignorable-whitespace",false);
      parser.setFeature("http://xml.org/sax/features/namespaces",false);
      // Expand entity references instead of creating EntityReference nodes.
      parser.setFeature("http://apache.org/xml/features/dom/create-entity-ref-nodes",false);
      // Keep going after a fatal error instead of stopping the parse.
      parser.setFeature("http://apache.org/xml/features/continue-after-fatal-error",true);
      parser.setFeature("http://xml.org/sax/features/validation",true);
      parser.parse(uri);
    } catch (IOException ioe){
      System.out.println(ioe);
    } catch(SAXException se){
      System.out.println(se);
    }
    return parser.getDocument();
  }
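
One caveat about the include-ignorable-whitespace setting: as far as I know, the parser can only tell that whitespace is ignorable when it has a DTD (or other grammar) with a content model for the enclosing element. On a document without one, the tabs between tags stay in the DOM as text nodes. Below is a minimal sketch of that behaviour. It goes through the JAXP DocumentBuilderFactory rather than DOMParser directly, and the class name, the sample documents and the childCount helper are all made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class IgnorableWhitespaceDemo {

    // Hypothetical sample: tabs between tags, no DTD.
    static final String NO_DTD =
        "<report>\t<time>15:04:27</time>\t<station num=\"17650\"/>\t</report>";

    // Same document, preceded by an internal DTD that gives <report>
    // an element-only content model.
    static final String WITH_DTD =
        "<!DOCTYPE report [\n"
        + "<!ELEMENT report (time, station)>\n"
        + "<!ELEMENT time (#PCDATA)>\n"
        + "<!ELEMENT station EMPTY>\n"
        + "<!ATTLIST station num CDATA #REQUIRED>\n"
        + "]>\n" + NO_DTD;

    // Parses xml and returns how many child nodes the root element has.
    static int childCount(String xml, boolean validating) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(validating);
        // Roughly the JAXP counterpart of turning include-ignorable-whitespace off.
        factory.setIgnoringElementContentWhitespace(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getChildNodes().getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("no DTD:   " + childCount(NO_DTD, false));
        System.out.println("with DTD: " + childCount(WITH_DTD, true));
    }
}
```

Without the DTD the count should include the tab-only text nodes; with the DTD and validation on, only the two elements should remain.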



----- Original Message -----
From: "Eric" <ej...@ir.iit.edu>
To: <xe...@xml.apache.org>
Sent: Thursday, May 17, 2001 10:36 AM
Subject: Re: problems formatting output - Mine works why doesn't yours?


> ya, i do almost exactly the same thing.  the problem is that the DOM
> i'm writing out has been parsed by xerces from an XML document that
> had no newlines, only tabs between tags...and i was really praying
> that xerces would figure out that i like every tag on its own line,
> indented nicely.  ;)
>
> eric.


---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-user-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-user-help@xml.apache.org


Re: problems formatting output - Mine works why doesn't yours?

Posted by Eric <ej...@ir.iit.edu>.
ya, i do almost exactly the same thing.  the problem is that the DOM
i'm writing out has been parsed by xerces from an XML document that
had no newlines, only tabs between tags...and i was really praying
that xerces would figure out that i like every tag on its own line,
indented nicely.  ;)

eric.
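
If the tabs between tags do end up in the DOM as whitespace-only text nodes, one possible workaround is to remove them before handing the tree to the serializer, so that the indenting code sees element-only content. Here is a minimal sketch, assuming plain org.w3c.dom calls; the class name, the helper names and the sample document are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class WhitespaceStripper {

    // Recursively removes text nodes that contain nothing but whitespace.
    static void stripWhitespace(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            // Remember the sibling first, since the child may be removed.
            Node next = child.getNextSibling();
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                node.removeChild(child);
            } else {
                stripWhitespace(child);
            }
            child = next;
        }
    }

    // Helper for the example: parse an XML string into a DOM.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(
            "<report>\t<time>15:04:27</time>\t<station num=\"17650\"/>\t</report>");
        stripWhitespace(doc.getDocumentElement());
        System.out.println(doc.getDocumentElement().getChildNodes().getLength());
    }
}
```

This throws away whitespace that may matter in mixed content (for example a single space between two inline elements), so it only suits data-style documents like the one in this thread.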

On Thu, May 17, 2001 at 09:48:22AM +1000, Anthony Ikeda wrote:
> My code outputs fine:
> [...]


Re: problems formatting output - Mine works why doesn't yours?

Posted by Anthony Ikeda <an...@proxima-tech.com.au>.
My code outputs fine:

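  import java.io.StringWriter;
  import org.apache.xml.serialize.OutputFormat;
  import org.apache.xml.serialize.XMLSerializer;
  import org.w3c.dom.Element;

  public static String serializeDoc(Element doc){
    String xmlString = "";
    StringWriter stringOut = new StringWriter();
    // The third constructor argument (true) turns indenting on.
    OutputFormat opfrmt = new OutputFormat(doc.getOwnerDocument(), "UTF-8", true);
    opfrmt.setIndenting(true);
    opfrmt.setPreserveSpace(false);

    XMLSerializer serial = new XMLSerializer(stringOut, opfrmt);

    try{
      serial.asDOMSerializer();
      serial.serialize( doc );
      xmlString = stringOut.toString();
    } catch(java.io.IOException ioe){
      // Signal failure to the caller with null.
      xmlString=null;
    }
    return xmlString;
  }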

Let me know if you still have problems
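
For comparison, here is a minimal sketch of the same job done with the standard JAXP Transformer instead of XMLSerializer. This is a different technique from the code above, not a description of how OutputFormat works. The class and method names are hypothetical, and the indent-amount property is implementation-specific (Xalan-based transformers understand it; others may ignore it).

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class JaxpPrettyPrinter {

    // Serializes a DOM node to a string with indenting switched on.
    static String prettyPrint(Node node) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.INDENT, "yes");
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        // Assumption: a Xalan-based transformer; the property is namespaced,
        // so other implementations should simply ignore it.
        t.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(node), new StreamResult(out));
        return out.toString();
    }

    // Helper for the example: parse an XML string into a DOM.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(
            "<report><time>15:04:27</time><station num=\"17650\"/></report>");
        System.out.print(prettyPrint(doc));
    }
}
```

The sample input has no whitespace between tags, so the indenter has nothing but elements to work with. If you stay with OutputFormat, it also has a setLineWidth method whose documentation describes a width of 0 as disabling line wrapping; whether that behaves the same in 1.3.1 is an assumption.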
----- Original Message -----
From: "Eric" <ej...@ir.iit.edu>
To: <xe...@xml.apache.org>
Sent: Wednesday, May 16, 2001 2:30 PM
Subject: problems formatting output


> I've been having some troubling problems using OutputFormat to write
> out a DOM.  I'm basically parsing some ugly XML (spaces before '>'s,
> no newlines, etc.) into a DOM and then writing it back
> out.  I've traversed the DOM and it looks like it gets parsed just
> fine, but when I use OutputFormat to write it out, the output still
> has ugliness (although different from the input).  Tags are
> word-wrapped so that they split over lines, etc.  It looks like this:
>
>                 <time>15:04:27</time>   <anten
>                 type="broadcast">    <Station num="17650">
>
> What am I doing wrong???
>
> Thanks,
> Eric.