Posted to j-dev@xerces.apache.org by Rick Reumann <ri...@coxtarget.com> on 2001/01/11 19:13:26 UTC
ampersand problem still?
Hi,
I've searched the archives for information related to problems when
trying to parse a document that contains ampersands and the threads
seem to stop around July but with no solutions. I've just installed
version 1.2.3 of xerces.jar hoping this would help but I'm still
running into the same problem: when using SAX2 /xerces any ampersand
in a document that I try to parse causes the error:
org.xml.sax.SAXParseException: The entity name must immediately follow the '&' in the entity reference.
I've tried replacing the "&" with various substitutions (such as
&) but still no luck.
Any suggestions/help ?
thanks,
Rick
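For reference, that message is the parser's standard complaint about a bare '&' in character data: XML treats '&' as the start of an entity reference, so a literal ampersand has to appear in the file as &amp;. Below is a minimal sketch of that check. It assumes the JAXP parser bundled with a current JDK rather than xerces.jar 1.2.3, and the names AmpersandCheck and isWellFormed are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of Xerces or of the original post.
class AmpersandCheck
{
    // Returns true if the string parses as well-formed XML, false if the
    // parser reports a SAXParseException (for example for a bare '&').
    static boolean isWellFormed(String xml) throws Exception
    {
        try
        {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        }
        catch (SAXParseException e)
        {
            return false;
        }
    }

    public static void main(String[] args) throws Exception
    {
        // Bare ampersand in character data.
        System.out.println(isWellFormed("<bullet>Elrick & Lavidge</bullet>"));
        // Ampersand written as the predefined entity reference.
        System.out.println(isWellFormed("<bullet>Elrick &amp; Lavidge</bullet>"));
    }
}
```

The first document is rejected and the second is accepted, so the escaping has to be done in the XML file itself, not in the handler code.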
Re: ampersand problem still, still
Posted by Rick Reumann <ri...@coxtarget.com>.
Yes, when I change it to &amp; the parser splits the line there into
two pieces. Although I don't get an error when I use &amp;, it still
produces less than desirable results.
On 12 Jan 2001, at 11:44, David Waite wrote:
> Rick Reumann wrote:
>
> >
> > <bullet>Val-Pak supports mailings with media campaigns on national
> > TV and consumer publications.*Source: 1998 Elrick & Lavidge</bullet>
>
> You mean "Elrick &amp; Lavidge" ?
>
> -David Waite
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: xerces-j-dev-unsubscribe@xml.apache.org For
> additional commands, e-mail: xerces-j-dev-help@xml.apache.org
>
Re: ampersand problem still, still
Posted by David Waite <dw...@jabber.com>.
Rick Reumann wrote:
>
> <bullet>Val-Pak supports mailings with media campaigns on national TV and consumer publications.*Source: 1998 Elrick & Lavidge</bullet>
You mean "Elrick &amp; Lavidge" ?
-David Waite
Re: ampersand problem still, still
Posted by Rick Reumann <ri...@coxtarget.com>.
On 12 Jan 2001, at 8:13, Luke Blanshard wrote:
> Can you please post a sample document that demonstrates this error
> when you run against the SAX2Count sample program? The following
> sample file is parsed fine by SAX2Count in Xerces version 1.2.1:
>
thanks for looking into this for me, I really appreciate it.
the actual doc looks like this...
<marketingBullets>
<bullet>The Val-Pak exclusive network of Neighborhood Trade Areas reach your best prospective customers without wasting advertising dollars. *Source: 1998 Elrick & Lavidge</bullet>
<bullet>Val-Pak mails over 15 billion ads annually in over 500 million familiar blue envelops to over 50 million unduplicated addresses.</bullet>
<bullet>Val-Pak supports mailings with media campaigns on national TV and consumer publications.*Source: 1998 Elrick & Lavidge</bullet>
<bullet>For over 32 years, Val-Pak mailings have been on-time and cost effective.</bullet>
</marketingBullets>
The parser class that I modified is below. (Possibly something in here
could be causing the problem. This class is supposed to put each
row that starts with <bullet> and ends with </bullet> into a bean and
then store that bean in a collection. It works fine if I don't have the
ampersands. I'm new to all this, so pardon me if I'm making some really
stupid newbie mistake. If someone would rather, I could send the
actual files.)
import java.io.FileReader;
import org.xml.sax.XMLReader;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.XMLReaderFactory;
import org.xml.sax.helpers.DefaultHandler;
import java.util.*;
import FHPMarketingBulletsBean;
import Content;

public class FHPMarketingBulletsParser extends DefaultHandler
{
    private Collection col = new ArrayList();
    private FHPMarketingBulletsBean bulletsBean = null;
    private String currentElement = null;
    private String filename = null;

    public FHPMarketingBulletsParser()
    {
        super();
    }

    public Collection getParsedFile(String contentName) throws Exception
    {
        Content cont = new Content( contentName );
        filename = cont.getFilename();
        XMLReader xr = new org.apache.xerces.parsers.SAXParser();
        FHPMarketingBulletsParser handler = new FHPMarketingBulletsParser();
        xr.setContentHandler(handler);
        xr.setErrorHandler(handler);
        FileReader r = new FileReader(filename);
        xr.parse(new InputSource(r));
        return ( handler.col );
    }

    public void startDocument ()
    {
        bulletsBean = new FHPMarketingBulletsBean();
    }

    public void endDocument ()
    {
        //endDocument
    }

    public void startElement (String uri, String name, String qName, Attributes atts)
    {
        currentElement = name;
        if ( name.equals("bullet") )
        {
            bulletsBean = new FHPMarketingBulletsBean();
        }
    }

    public void endElement (String uri, String name, String qName)
    {
        if ( name.equals("bullet"))
        {
            col.add( bulletsBean );
        }
    }

    public void characters (char ch[], int start, int length)
    {
        StringBuffer elementValue = new StringBuffer(length);
        for (int i = start; i < start + length; i++)
        {
            switch (ch[i])
            {
                case '\\':
                    elementValue.append(ch[i]);
                    break;
                case '"':
                    elementValue.append(ch[i]);
                    break;
                case '\n':
                    //elementValue.append(ch[i]);
                    break;
                case '\r':
                    //elementValue.append(ch[i]);
                    break;
                case '\t':
                    //elementValue.append(ch[i]);
                    break;
                default:
                    elementValue.append(ch[i]);
                    break;
            }
        }
        String temp = null;
        if ( elementValue.length() > 0 && !( elementValue.toString() ).equals(" ") )
        {
            temp = elementValue.toString();
        }
        bulletsBean.setElementValue( currentElement, temp );
    }
}
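One likely cause of the "split" text, offered as a sketch and not as a confirmed diagnosis: SAX allows the parser to deliver the text of a single element in several characters() calls, and parsers commonly start a new chunk at an entity reference such as &amp;. The handler above calls setElementValue() from inside characters(), so depending on how the bean stores values, each chunk may replace the previous one. The version below buffers the chunks and only stores the text in endElement(). It assumes the JAXP parser from a current JDK (which is not namespace-aware by default, hence qName instead of the local name). BulletTextHandler and its parse() helper are hypothetical names, and a plain List of Strings stands in for the bean and collection.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the full text of each <bullet> element.
class BulletTextHandler extends DefaultHandler
{
    private final List<String> bullets = new ArrayList<String>();
    private StringBuilder text = null; // non-null only while inside <bullet>

    public List<String> getBullets()
    {
        return bullets;
    }

    @Override
    public void startElement(String uri, String name, String qName, Attributes atts)
    {
        if ("bullet".equals(qName))
        {
            text = new StringBuilder(); // start a fresh buffer per bullet
        }
    }

    @Override
    public void characters(char[] ch, int start, int length)
    {
        // May be called more than once per element, so append, never assign.
        if (text != null)
        {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String name, String qName)
    {
        if ("bullet".equals(qName))
        {
            bullets.add(text.toString().trim()); // the element is complete here
            text = null;
        }
    }

    // Convenience wrapper for the example: parse a String and return the bullets.
    static List<String> parse(String xml) throws Exception
    {
        BulletTextHandler handler = new BulletTextHandler();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.getBullets();
    }

    public static void main(String[] args) throws Exception
    {
        String xml = "<marketingBullets>"
            + "<bullet>*Source: 1998 Elrick &amp; Lavidge</bullet>"
            + "</marketingBullets>";
        System.out.println(parse(xml));
    }
}
```

The same buffering should also cover CDATA sections, since their content is reported through characters() as well. The sketch only trims leading and trailing whitespace; the original handler additionally dropped newlines and tabs inside the text.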
> <test>
> Here's some sample text with ampersands &amp; other character
> entity refs embedded. <test attr='&amp; here&apos;s an attribute
> value with the same'/>
> </test>
>
> Luke
>
> Rick Reumann wrote:
>
> > > Am I missing something obvious, or wouldn't &amp; do the job?
> > >
> > > Ian
> >
> > Actually I tried that first..then I tried both & and & as
> > others have suggested. I also haven't had much luck using the CDATA
> > sections to try to escape it as when I do this I don't get an error
> > but the parsing then breaks at the CDATA section which is just as
> > bad. To recap, I'm trying to parse an XML doc using SAX2 and when I
> > get to an ampersand in the doc it causes an error. Any help or
> > direction where to go from here would be much appreciated. Thanks,
> > Rick
> >
> >
> > > On Thu, 11 Jan 2001, Rick Reumann wrote:
> > >
> > > > Hi,
> > > > I've searched the archives for information related to problems
> > > > when trying to parse a document that contains ampersands and the
> > > > threads seem to stop around July but with no solutions. I've
> > > > just installed version 1.2.3 of xerces.jar hoping this would
> > > > help but I'm still running into the same problem: when using
> > > > SAX2 /xerces any ampersand in a document that I try to parse
> > > > causes the error: org.xml.sax.SAXParseException: The entity name
> > > > must immediately follow the '&' in the entity reference. I've
> > > > tried replacing the "&" with various substitutions (such as
> > > > &) but still no luck. Any suggestions/help ?
> > >
> > > Am I missing something obvious, or wouldn't &amp; do the job?
> > >
> > > Ian
> > >
> > > --
> > > Ian Roberts | irr@decisionsoft.com
> > > DecisionSoft Ltd. | http://www.decisionsoft.com/
> > >
Re: ampersand problem still, still
Posted by Luke Blanshard <lu...@quiq.com>.
Can you please post a sample document that demonstrates this error when you
run against the SAX2Count sample program? The following sample file is
parsed fine by SAX2Count in Xerces version 1.2.1:
<test>
Here's some sample text with ampersands &amp; other character
entity refs embedded.
<test attr='&amp; here&apos;s an attribute value with the same'/>
</test>
Luke
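When a document like this is generated by a program, the markup characters in the text can be escaped before the file is written. The following is a minimal sketch, assuming only the five entities that XML predefines. The class and method names are hypothetical, and text that is already escaped would be escaped a second time.

```java
// Hypothetical helper, not part of Xerces: escapes text for use in XML
// element content or attribute values.
class XmlEscape
{
    static String escape(String s)
    {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++)
        {
            char c = s.charAt(i);
            switch (c)
            {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);        break;
            }
        }
        return out.toString();
    }

    public static void main(String[] args)
    {
        System.out.println("<bullet>" + escape("*Source: 1998 Elrick & Lavidge") + "</bullet>");
    }
}
```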
Rick Reumann wrote:
> > Am I missing something obvious, or wouldn't &amp; do the job?
> >
> > Ian
>
> Actually I tried that first..then I tried both & and & as
> others have suggested. I also haven't had much luck using the CDATA
> sections to try to escape it as when I do this I don't get an error
> but the parsing then breaks at the CDATA section which is just as
> bad. To recap, I'm trying to parse an XML doc using SAX2 and when I
> get to an ampersand in the doc it causes an error. Any help or
> direction where to go from here would be much appreciated.
> Thanks,
> Rick
>
>
> > On Thu, 11 Jan 2001, Rick Reumann wrote:
> >
> > > Hi,
> > > I've searched the archives for information related to problems when
> > > trying to parse a document that contains ampersands and the threads
> > > seem to stop around July but with no solutions. I've just installed
> > > version 1.2.3 of xerces.jar hoping this would help but I'm still
> > > running into the same problem: when using SAX2 /xerces any ampersand
> > > in a document that I try to parse causes the error:
> > > org.xml.sax.SAXParseException: The entity name must immediately
> > > follow the '&' in the entity reference. I've tried replacing the "&"
> > > with various substitutions (such as &) but still no luck. Any
> > > suggestions/help ?
> >
> > Am I missing something obvious, or wouldn't &amp; do the job?
> >
> > Ian
> >
> > --
> > Ian Roberts | irr@decisionsoft.com
> > DecisionSoft Ltd. | http://www.decisionsoft.com/
> >
Re: ampersand problem still, still
Posted by Rick Reumann <ri...@coxtarget.com>.
> Am I missing something obvious, or wouldn't &amp; do the job?
>
> Ian
Actually I tried that first..then I tried both & and & as
others have suggested. I also haven't had much luck using the CDATA
sections to try to escape it as when I do this I don't get an error
but the parsing then breaks at the CDATA section which is just as
bad. To recap, I'm trying to parse an XML doc using SAX2 and when I
get to an ampersand in the doc it causes an error. Any help or
direction where to go from here would be much appreciated.
Thanks,
Rick
> On Thu, 11 Jan 2001, Rick Reumann wrote:
>
> > Hi,
> > I've searched the archives for information related to problems when
> > trying to parse a document that contains ampersands and the threads
> > seem to stop around July but with no solutions. I've just installed
> > version 1.2.3 of xerces.jar hoping this would help but I'm still
> > running into the same problem: when using SAX2 /xerces any ampersand
> > in a document that I try to parse causes the error:
> > org.xml.sax.SAXParseException: The entity name must immediately
> > follow the '&' in the entity reference. I've tried replacing the "&"
> > with various substitutions (such as &) but still no luck. Any
> > suggestions/help ?
>
> Am I missing something obvious, or wouldn't &amp; do the job?
>
> Ian
>
> --
> Ian Roberts | irr@decisionsoft.com
> DecisionSoft Ltd. | http://www.decisionsoft.com/
>