Posted to j-dev@xerces.apache.org by Rick Reumann <ri...@coxtarget.com> on 2001/01/12 14:46:19 UTC
Re: ampersand problem still, still
> Am I missing something obvious, or wouldn't &amp; do the job?
>
> Ian
Actually I tried that first..then I tried both & and & as
others have suggested. I also haven't had much luck using the CDATA
sections to try to escape it as when I do this I don't get an error
but the parsing then breaks at the CDATA section which is just as
bad. To recap, I'm trying to parse an XML doc using SAX2 and when I
get to an ampersand in the doc it causes an error. Any help or
direction where to go from here would be much appreciated.
Thanks,
Rick
> On Thu, 11 Jan 2001, Rick Reumann wrote:
>
> > Hi,
> > I've searched the archives for information related to problems when
> > trying to parse a document that contains ampersands and the threads
> > seem to stop around July but with no solutions. I've just installed
> > version 1.2.3 of xerces.jar hoping this would help but I'm still
> > running into the same problem: when using SAX2 /xerces any ampersand
> > in a document that I try to parse causes the error:
> > org.xml.sax.SAXParseException: The entity name must immediately
> > follow the '&' in the entity reference. I've tried replacing the "&"
> > with various substitutions (such as &) but still no luck. Any
> > suggestions/help ?
>
> Am I missing something obvious, or wouldn't &amp; do the job?
>
> Ian
>
> --
> Ian Roberts | irr@decisionsoft.com
> DecisionSoft Ltd. | http://www.decisionsoft.com/
>
Re: ampersand problem still, still
Posted by Rick Reumann <ri...@coxtarget.com>.
Yes, when I change it to &amp; the parser splits the line there into
two pieces. Although I don't get an error when I use &amp;, it still
produces less than desirable results.
On 12 Jan 2001, at 11:44, David Waite wrote:
> Rick Reumann wrote:
>
> >
> > <bullet>Val-Pak supports mailings with media campaigns on national
> > TV and consumer publications.*Source: 1998 Elrick & Lavidge</bullet>
>
> You mean "Elrick &amp; Lavidge"?
>
> -David Waite
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: xerces-j-dev-unsubscribe@xml.apache.org For
> additional commands, e-mail: xerces-j-dev-help@xml.apache.org
>
Re: ampersand problem still, still
Posted by David Waite <dw...@jabber.com>.
Rick Reumann wrote:
>
> <bullet>Val-Pak supports mailings with media campaigns on national TV and consumer publications.*Source: 1998 Elrick & Lavidge</bullet>
You mean "Elrick &amp; Lavidge"?
-David Waite
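
For readers following along: the suggestion above amounts to escaping the ampersand in the source file before it reaches the parser. The sketch below is only an illustration under stated assumptions. It uses the JDK's built-in JAXP parser rather than Xerces 1.2.x, and the class name AmpersandCheck and both of its methods are hypothetical, not part of the thread or of any library.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, for illustration only.
class AmpersandCheck {

    // Escape the characters that may not appear literally in XML text.
    // '&' must be replaced first, or the '&' of "&lt;" would be escaped again.
    static String escapeText(String raw) {
        return raw.replace("&", "&amp;").replace("<", "&lt;");
    }

    // Returns true if the parser accepts the document, false if it reports
    // a parse error (DefaultHandler rethrows fatal errors as SAXParseException).
    static boolean isWellFormed(String xml) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this sketch, a bullet containing a bare "Elrick & Lavidge" is rejected, while the same text passed through escapeText is accepted.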
Re: ampersand problem still, still
Posted by Rick Reumann <ri...@coxtarget.com>.
On 12 Jan 2001, at 8:13, Luke Blanshard wrote:
> Can you please post a sample document that demonstrates this error
> when you run against the SAX2Count sample program? The following
> sample file is parsed fine by SAX2Count in Xerces version 1.2.1:
>
Thanks for looking into this for me, I really appreciate it.
The actual doc looks like this...
<marketingBullets>
<bullet>The Val-Pak exclusive network of Neighborhood Trade Areas reach your best prospective customers without wasting advertising dollars. *Source: 1998 Elrick & Lavidge</bullet>
<bullet>Val-Pak mails over 15 billion ads annually in over 500 million familiar blue envelops to over 50 million unduplicated addresses.</bullet>
<bullet>Val-Pak supports mailings with media campaigns on national TV and consumer publications.*Source: 1998 Elrick & Lavidge</bullet>
<bullet>For over 32 years, Val-Pak mailings have been on-time and cost effective.</bullet>
</marketingBullets>
The parser class that I modified is below. Possibly something in here
could be causing the problem. This class is supposed to put each
row that starts with <bullet> and ends with </bullet> into a bean and
then store that bean in a collection. It works fine if I don't have the
ampersands. I'm new to all this, so pardon me if I'm making some really
stupid newbie mistake. If someone would rather, I could send the
actual files.
import java.io.FileReader;
import org.xml.sax.XMLReader;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.XMLReaderFactory;
import org.xml.sax.helpers.DefaultHandler;
import java.util.*;
import FHPMarketingBulletsBean;
import Content;

public class FHPMarketingBulletsParser extends DefaultHandler
{
    private Collection col = new ArrayList();
    private FHPMarketingBulletsBean bulletsBean = null;
    private String currentElement = null;
    private String filename = null;

    public FHPMarketingBulletsParser()
    {
        super();
    }

    // Parses the file behind contentName and returns the collected beans.
    public Collection getParsedFile(String contentName) throws Exception
    {
        Content cont = new Content( contentName );
        filename = cont.getFilename();
        XMLReader xr = new org.apache.xerces.parsers.SAXParser();
        FHPMarketingBulletsParser handler = new FHPMarketingBulletsParser();
        xr.setContentHandler(handler);
        xr.setErrorHandler(handler);
        FileReader r = new FileReader(filename);
        xr.parse(new InputSource(r));
        return ( handler.col );
    }

    public void startDocument ()
    {
        bulletsBean = new FHPMarketingBulletsBean();
    }

    public void endDocument ()
    {
        //endDocument
    }

    // Starts a fresh bean for every <bullet> element.
    public void startElement (String uri, String name, String qName, Attributes atts)
    {
        currentElement = name;
        if ( name.equals("bullet") )
        {
            bulletsBean = new FHPMarketingBulletsBean();
        }
    }

    // Stores the finished bean when </bullet> is reached.
    public void endElement (String uri, String name, String qName)
    {
        if ( name.equals("bullet"))
        {
            col.add( bulletsBean );
        }
    }

    public void characters (char ch[], int start, int length)
    {
        // Copy the characters, dropping newlines, carriage returns and tabs.
        StringBuffer elementValue = new StringBuffer(length);
        for (int i = start; i < start + length; i++)
        {
            switch (ch[i])
            {
                case '\\':
                    elementValue.append(ch[i]);
                    break;
                case '"':
                    elementValue.append(ch[i]);
                    break;
                case '\n':
                    //elementValue.append(ch[i]);
                    break;
                case '\r':
                    //elementValue.append(ch[i]);
                    break;
                case '\t':
                    //elementValue.append(ch[i]);
                    break;
                default:
                    elementValue.append(ch[i]);
                    break;
            }
        }
        // Set the value on the bean; it stays null if the text is empty or a single space.
        String temp = null;
        if ( elementValue.length() > 0 && !( elementValue.toString() ).equals(" ") )
        {
            temp = elementValue.toString();
        }
        bulletsBean.setElementValue( currentElement, temp );
    }
}
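
A note on the handler above, offered as a possible explanation rather than a confirmed diagnosis: the SAX contract allows a parser to deliver the text of one element through several characters() calls, and parsers commonly break the text at an entity reference such as &amp;. A handler that sets the value on every call would then keep only the last chunk, which could look like the line being "split". The sketch below accumulates the chunks and stores the text in endElement instead. It assumes the JDK's built-in JAXP parser rather than Xerces 1.2.x, and the class BulletTextCollector is hypothetical; it collects plain strings in place of FHPMarketingBulletsBean and does not strip newlines or tabs.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler, for illustration only: collects the text of each <bullet>.
class BulletTextCollector extends DefaultHandler {
    private final List<String> bullets = new ArrayList<>();
    private StringBuilder text; // non-null only while inside a <bullet>

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // The default JAXP factory is not namespace-aware, so qName holds the element name.
        if ("bullet".equals(qName)) {
            text = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called more than once per element: append, never overwrite.
        if (text != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("bullet".equals(qName)) {
            bullets.add(text.toString());
            text = null;
        }
    }

    // Parses an XML string and returns the text of every <bullet> element.
    static List<String> parse(String xml) {
        BulletTextCollector handler = new BulletTextCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler.bullets;
    }
}
```

Because the text is only stored once the closing tag arrives, the result does not depend on how many chunks the parser chooses to deliver.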
> <test>
> Here's some sample text with ampersands &amp; other character
> entity refs embedded. <test attr='&amp; here&apos;s an attribute
> value with the same'/>
> </test>
>
> Luke
>
Re: ampersand problem still, still
Posted by Luke Blanshard <lu...@quiq.com>.
Can you please post a sample document that demonstrates this error when you
run against the SAX2Count sample program? The following sample file is
parsed fine by SAX2Count in Xerces version 1.2.1:
<test>
Here's some sample text with ampersands &amp; other character
entity refs embedded.
<test attr='&amp; here&apos;s an attribute value with the same'/>
</test>
Luke
Rick Reumann wrote:
> > Am I missing something obvious, or wouldn't &amp; do the job?
> >
> > Ian
>
> Actually I tried that first..then I tried both & and & as
> others have suggested. I also haven't had much luck using the CDATA
> sections to try to escape it as when I do this I don't get an error
> but the parsing then breaks at the CDATA section which is just as
> bad. To recap, I'm trying to parse an XML doc using SAX2 and when I
> get to an ampersand in the doc it causes an error. Any help or
> direction where to go from here would be much appreciated.
> Thanks,
> Rick
>
>
> > On Thu, 11 Jan 2001, Rick Reumann wrote:
> >
> > > Hi,
> > > I've searched the archives for information related to problems when
> > > trying to parse a document that contains ampersands and the threads
> > > seem to stop around July but with no solutions. I've just installed
> > > version 1.2.3 of xerces.jar hoping this would help but I'm still
> > > running into the same problem: when using SAX2 /xerces any ampersand
> > > in a document that I try to parse causes the error:
> > > org.xml.sax.SAXParseException: The entity name must immediately
> > > follow the '&' in the entity reference. I've tried replacing the "&"
> > > with various substitutions (such as &) but still no luck. Any
> > > suggestions/help ?
> >
> > Am I missing something obvious, or wouldn't &amp; do the job?
> >
> > Ian
> >
> > --
> > Ian Roberts | irr@decisionsoft.com
> > DecisionSoft Ltd. | http://www.decisionsoft.com/
> >