Posted to dev@cocoon.apache.org by Rob Johnston <rj...@juniper.net> on 2003/03/30 19:18:44 UTC

Question about TextRecorder in the SAXTransformer

In TextRecorder.java, I see the following:

public void characters(char ary[], int start, int length)
        throws SAXException {
    String value = new String(ary, start, length).trim();
    if (value.length() > 0) {
        buffer.append(value);
    } else {
        buffer.append(' ');
    }
}

I am wondering what the logic is behind the trim() call there.
Several times I have noticed Cocoon "eating" spaces out of the stream,
as this log shows:

INFO    2003-03-28 16:05:14.728 [sitemap.transformer.log ] (): [startElement] uri=http://xml.juniper.net/cocoon/work-schedule/1.0,local=start,raw=sched:start
DEBUG   2003-03-28 16:05:14.729 [sitemap.transformer.sche] (): BEGIN startTextRecording
DEBUG   2003-03-28 16:05:14.730 [sitemap.transformer.sche] (): END startTextRecording
INFO    2003-03-28 16:05:14.731 [sitemap.transformer.log ] (): [characters] 03/28/2003 
INFO    2003-03-28 16:05:14.733 [sitemap.transformer.log ] (): [characters] 04:05 PM
INFO    2003-03-28 16:05:14.734 [sitemap.transformer.log ] (): [endElement] uri=http://xml.juniper.net/cocoon/work-schedule/1.0,local=start,qname=sched:start
DEBUG   2003-03-28 16:05:14.735 [sitemap.transformer.sche] (): BEGIN endTextRecording
DEBUG   2003-03-28 16:05:14.736 [sitemap.transformer.sche] (): END endTextRecording text=03/28/200304:05 PM
INFO    2003-03-28 16:05:14.737 [sitemap.transformer.log ] (): [endElement] uri=http://xml.juniper.net/cocoon/work-schedule/1.0,local=every,qname=sched:every
ERROR   2003-03-28 16:05:14.739 [sitemap.transformer.sche] (): URLWorkScheduleTransformer: invalid time: 03/28/200304:05 PM

<sched:start>03/28/2003 04:05 PM</sched:start>

is what should be passed to the next transformer. Instead, the text
arrives in two separate [characters] events, and because each event is
trimmed on its own, it is put back together as "03/28/200304:05 PM",
which is invalid.
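To make the effect concrete, here is a minimal standalone sketch that replays the two [characters] events from the log through the same trim-and-append logic. The class name TrimPerChunkDemo and the helper record() are made up for illustration and are not part of Cocoon. The chunk boundary (a trailing space at the end of the first chunk) is taken from the log above. SAX allows a parser to deliver contiguous text in several characters() calls, so the split itself is simulated here as ordinary input.

```java
public class TrimPerChunkDemo {

    // Same logic as the characters() method quoted above, with a local
    // StringBuffer standing in for the recorder's buffer field.
    static void appendTrimmed(StringBuffer buffer, char[] ary, int start, int length) {
        String value = new String(ary, start, length).trim();
        if (value.length() > 0) {
            buffer.append(value);
        } else {
            buffer.append(' ');
        }
    }

    // Hypothetical helper: feeds each chunk through appendTrimmed as if it
    // were one characters() event, and returns the recorded text.
    static String record(String... chunks) {
        StringBuffer buffer = new StringBuffer();
        for (String chunk : chunks) {
            char[] ary = chunk.toCharArray();
            appendTrimmed(buffer, ary, 0, ary.length);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        // Two events, split after the space: the separator is trimmed away.
        System.out.println(record("03/28/2003 ", "04:05 PM")); // 03/28/200304:05 PM
        // One event: the inner space survives because trim() only touches the ends.
        System.out.println(record("03/28/2003 04:05 PM"));     // 03/28/2003 04:05 PM
    }
}
```

The recorded text therefore depends on where the parser happens to split the character data, not only on the content of the element.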

I was just wondering why the trim() is in there, or if what I am seeing
is expected behavior.

thanks,
rob

Re: Question about TextRecorder in the SAXTransformer

Posted by Rob Johnston <rj...@juniper.net>.
Is the bug the fact that the data is being passed in two [characters]
events or that the .trim() is in TextRecorder.java?  I can nuke the
.trim() locally but don't want to hose anything down the line.

I just want to 'fix' it here locally since I'm under a time constraint,
but I will also file a PR for you all.
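For reference, a rough sketch of the kind of local change I have in mind, assuming it is acceptable to keep every character as received and trim only once when recording ends. RawTextRecorderSketch and getText() are hypothetical names, not the real Cocoon class or API, and this is not necessarily how the fix should look upstream.

```java
public class RawTextRecorderSketch {

    private final StringBuffer buffer = new StringBuffer();

    // Append the event's characters untouched, so whitespace that falls
    // on a chunk boundary is preserved.
    public void characters(char[] ary, int start, int length) {
        buffer.append(ary, start, length);
    }

    // Assumption: callers only care about leading and trailing whitespace
    // of the whole recorded text, so trim once at the end.
    public String getText() {
        return buffer.toString().trim();
    }
}
```

Feeding it the two chunks from the log ("03/28/2003 " and "04:05 PM") should give back "03/28/2003 04:05 PM" regardless of where the split happens.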

thanks,
rob

On Mon, 31 Mar 2003, Carsten Ziegeler wrote:

> Hi Rob,
> 
> this is a bug!
> 
> Carsten

RE: Question about TextRecorder in the SAXTransformer

Posted by Carsten Ziegeler <cz...@s-und-n.de>.
Hi Rob,

this is a bug!

Carsten
