Posted to j-users@xerces.apache.org by Jordi Massaguer <ma...@fossi.uni-weimar.de> on 2001/07/24 16:40:39 UTC

Encoding problem....

Hi all!

I think I have an encoding problem and I don't know how to solve it. The
problem is that when the parser finds "uncommon" characters, for
example "@", it crashes. My XML file starts like this:

<?xml version="1.0" encoding="UTF-8"?>

I think that maybe a solution would be to change the encoding, but I
don't know which one to declare so that it accepts the largest set of
characters.

Thank you,

jordi


---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-user-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-user-help@xml.apache.org


DTD Reading Urgent

Posted by chandru <cu...@connextive.com>.

Hi friends,
  When I read a DTD with the DTDReader (using the parser's decl-handler
feature), the content model of each element is reported as a normalized
definition: all parameter entities are fully resolved, all whitespace is
removed, and the enclosing parentheses are included. How can I stop this? I
want the unnormalized definition of the element.
For example:
<!ENTITY % _NOE_ "Msgfun,getInfo,(Reader)">
<!ELEMENT NOE (%_NOE_;)>

What I am expecting is:
  element name: NOE
  content model: (%_NOE_;)

What the parser reports to the DeclHandler is:
  element name: NOE
  content model: (Msgfun,getInfo,(Reader))

How can I get back the original declarations, i.e. without the normalization?
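
For reference, the SAX DeclHandler documentation itself describes the model
passed to elementDecl as normalized, and I am not aware of a switch in that
interface that turns the normalization off. What the same handler does
receive is each parameter entity declaration, via internalEntityDecl with a
name that begins with "%". The sketch below records both. It assumes the
JAXP parser bundled with a current JDK, a quoted entity value declared
before its use, and a DTD served from a string through an entity resolver.
The names DeclHandlerDemo, Recorder and noe.dtd are made up for the
illustration.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

final class DeclHandlerDemo {

    // Hypothetical external DTD subset. Parameter entity references inside
    // a declaration are only allowed in the external subset.
    static final String DTD =
            "<!ENTITY % _NOE_ \"Msgfun,getInfo,(Reader)\">\n"
          + "<!ELEMENT NOE (%_NOE_;)>\n";

    // Hypothetical instance document pointing at that DTD.
    static final String XML = "<!DOCTYPE NOE SYSTEM \"noe.dtd\"><NOE/>";

    /** Collects what the parser reports through the DeclHandler callbacks. */
    static final class Recorder extends DefaultHandler2 {
        final Map<String, String> models = new LinkedHashMap<>();
        final Map<String, String> entities = new LinkedHashMap<>();

        @Override
        public void elementDecl(String name, String model) {
            models.put(name, model);      // model arrives already normalized
        }

        @Override
        public void internalEntityDecl(String name, String value) {
            entities.put(name, value);    // parameter entities start with '%'
        }
    }

    static Recorder record() {
        try {
            XMLReader reader =
                    SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            Recorder recorder = new Recorder();
            reader.setProperty(
                    "http://xml.org/sax/properties/declaration-handler", recorder);
            // Serve the DTD text above for any external entity request.
            reader.setEntityResolver(
                    (publicId, systemId) -> new InputSource(new StringReader(DTD)));
            reader.setErrorHandler(recorder);
            reader.parse(new InputSource(new StringReader(XML)));
            return recorder;
        } catch (Exception e) {
            throw new IllegalStateException("could not read the DTD", e);
        }
    }

    public static void main(String[] args) {
        Recorder recorder = record();
        System.out.println("models:   " + recorder.models);
        System.out.println("entities: " + recorder.entities);
    }
}
```

With the entity values in hand, an application could at least tell which
parameter entities exist and what they expand to, even though the element
model itself still arrives expanded.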

Expecting your mail soon....


from
chandra sekhar







Re: Encoding problem....

Posted by Elliotte Rusty Harold <el...@metalab.unc.edu>.
At 4:40 PM +0200 7/24/01, Jordi Massaguer wrote:
>Hi all!
>
>I think I have an encoding problem and I don't know how to solve it. The
>problem is that when the parser finds "uncommon" characters, for
>example "@", it crashes. My XML file starts like this:
>

Define "crash". I doubt very much it crashes. Java programs rarely do. I suspect it's throwing an exception because it's detected a well-formedness error in your document.

The @ sign is a little unusual. Is that really a character that's giving you problems? By any chance does it immediately follow a non-ASCII character? 

><?xml version="1.0" encoding="UTF-8"?>
>

And is your document actually written in UTF-8? If it's not, that would explain the problem. 

>I think that maybe a solution would be to change the encoding, but I
>don't know which one to declare so that it accepts the largest set of
>characters.
>

The encoding you declare must match the encoding you actually use. You can't change one without changing the other. 
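
A minimal sketch of that rule, assuming the JAXP parser bundled with the
JDK: the same short document is saved in one encoding and declared as
another, and the parser is asked whether it accepts the bytes. The class and
method names (EncodingMismatchDemo, document, parses) are made up for the
illustration, and the sample text is the set of characters reported
elsewhere in this thread.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class EncodingMismatchDemo {

    /** Builds a tiny document that declares one encoding and is saved in another. */
    static byte[] document(String declared, Charset actual) {
        String xml = "<?xml version=\"1.0\" encoding=\"" + declared + "\"?>"
                + "<text>\u00E4\u00F6\u00FC \u00C4\u00D6\u00DC \u00DF</text>";
        return xml.getBytes(actual);
    }

    /** Returns true if the parser accepts the bytes as well-formed XML. */
    static boolean parses(byte[] bytes) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(bytes));
            return true;
        } catch (SAXException | IOException e) {
            return false;   // a parse failure is an exception, not a crash
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Declared UTF-8, actually saved as ISO-8859-1: the mismatch case.
        System.out.println(parses(document("UTF-8", StandardCharsets.ISO_8859_1)));
        // Declaration and bytes agree.
        System.out.println(parses(document("ISO-8859-1", StandardCharsets.ISO_8859_1)));
        System.out.println(parses(document("UTF-8", StandardCharsets.UTF_8)));
    }
}
```

I would expect the first call to report false and the other two true: the
characters are acceptable in either encoding, and only the disagreement
between the declaration and the bytes is rejected.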
-- 

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+ 
|          The XML Bible, 2nd Edition (Hungry Minds, 2001)           |
|              http://www.ibiblio.org/xml/books/bible2/              |
|   http://www.amazon.com/exec/obidos/ISBN=0764547607/cafeaulaitA/   |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      | 
|  Read Cafe con Leche for XML News: http://www.ibiblio.org/xml/     |
+----------------------------------+---------------------------------+



Re: Encoding problem....

Posted by Ian Roberts <ir...@decisionsoft.com>.
On Tue, 24 Jul 2001, Jordi Massaguer wrote:

> It works fine! Thank you! How can I find out which characters
> "ISO-8859-1" accepts?

http://www.htmlhelp.com/reference/charset/
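
If you would rather check from Java than from a table, a charset encoder can
answer the question for a given string. This is only a sketch; the class and
method names (Latin1Check, fitsLatin1) are made up for the illustration.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

final class Latin1Check {

    /** Returns true if every character of s can be written in ISO-8859-1. */
    static boolean fitsLatin1(String s) {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        return encoder.canEncode(s);
    }

    public static void main(String[] args) {
        // The German characters from this thread (U+00E4 and friends).
        System.out.println(fitsLatin1("\u00E4\u00F6\u00FC \u00C4\u00D6\u00DC \u00DF"));
        // The euro sign, U+20AC, lies outside ISO-8859-1.
        System.out.println(fitsLatin1("\u20AC"));
    }
}
```

ISO-8859-1 covers the code points U+0000 through U+00FF. A character outside
that range can still appear in an ISO-8859-1 document as a numeric character
reference such as &#8364;.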

Ian

-- 
Ian Roberts, Software Engineer        DecisionSoft Ltd.
Telephone: +44-1865-203192            http://www.decisionsoft.com




Re: Encoding problem....

Posted by Jordi Massaguer <ma...@fossi.uni-weimar.de>.
It works fine! Thank you! How can I find out which characters
"ISO-8859-1" accepts?

Thank you,

jordi

Ian Roberts wrote:

> On Tue, 24 Jul 2001, Jordi Massaguer wrote:
>
> > :_( No, it doesn't work after all. I hadn't tested enough before. What it
> > says is:
> >
> > Character conversion error: "Malformed UTF-8 char -- is an XML encoding
> > declaration missing?" (line number may be too low).
> >
> > What I meant by "crash" is that my program stops working (maybe it throws an
> > exception, I don't know :-) ).
> >
> > Here are some characters that don't work:
> >
> > äöü ÄÖÜ and ß
>
> It looks like you want the encoding "ISO-8859-1"
>
> Ian
>
> --
> Ian Roberts, Software Engineer        DecisionSoft Ltd.
> Telephone: +44-1865-203192            http://www.decisionsoft.com
>




Re: Encoding problem....

Posted by Ian Roberts <ir...@decisionsoft.com>.
On Tue, 24 Jul 2001, Jordi Massaguer wrote:

> :_( No, it doesn't work after all. I hadn't tested enough before. What it
> says is:
>
> Character conversion error: "Malformed UTF-8 char -- is an XML encoding
> declaration missing?" (line number may be too low).
>
> What I meant by "crash" is that my program stops working (maybe it throws an
> exception, I don't know :-) ).
>
> Here are some characters that don't work:
> 
> äöü ÄÖÜ and ß

It looks like you want the encoding "ISO-8859-1"
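
One way to confirm the diagnosis is to test whether the file's bytes are
valid UTF-8 at all. The sketch below does that with a strict decoder from
the standard library; Utf8Check and isValidUtf8 are made-up names, and in
practice the byte array would come from the file (for example via
java.nio.file.Files.readAllBytes).

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class Utf8Check {

    /** Returns true if the bytes form a valid UTF-8 sequence. */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    // Report bad input instead of silently replacing it.
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String umlauts = "\u00E4\u00F6\u00FC";   // the first three characters above
        System.out.println(isValidUtf8(umlauts.getBytes(StandardCharsets.ISO_8859_1)));
        System.out.println(isValidUtf8(umlauts.getBytes(StandardCharsets.UTF_8)));
        System.out.println(isValidUtf8("plain @ text".getBytes(StandardCharsets.ISO_8859_1)));
    }
}
```

Text that contains only ASCII characters is byte-for-byte identical in both
encodings, which would explain a file that parses until the first accented
letter appears.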

Ian

-- 
Ian Roberts, Software Engineer        DecisionSoft Ltd.
Telephone: +44-1865-203192            http://www.decisionsoft.com




Re: Encoding problem....

Posted by Jordi Massaguer <ma...@fossi.uni-weimar.de>.
:_( No, it doesn't work after all. I hadn't tested enough before. What it
says is:

Character conversion error: "Malformed UTF-8 char -- is an XML encoding
declaration missing?" (line number may be too low).

What I meant by "crash" is that my program stops working (maybe it throws an
exception, I don't know :-) ).

Here are some characters that don't work:

äöü ÄÖÜ and ß

jordi

Jordi Massaguer wrote:

> Thank you Bob!
>
> I see that sometimes the simple solutions are the best, aren't they? You
> solved my problem with a very simple solution and now it works! Thank you
> again!
>
> jordi
>
> Bob Jamison wrote:
>
> > Jordi Massaguer wrote:
> >
> > >Hi all!
> > >
> > >I think I have an encoding problem and I don't know how to solve it. The
> > >problem is that when the parser finds "uncommon" characters, for
> > >example "@", it crashes. My XML file starts like this:
> > >
> > ><?xml version="1.0" encoding="UTF-8"?>
> > >
> > >I think that maybe a solution would be to change the encoding, but I
> > >don't know which one to declare so that it accepts the largest set of
> > >characters.
> > >
> >
> > Jordi,
> >
> > Or you can do what lazy people like me do, and
> > avoid the problem altogether:
> >
> > <?xml version="1.0"?>
> >
> > Bob
> >




Re: Encoding problem....

Posted by Jordi Massaguer <ma...@fossi.uni-weimar.de>.
Thank you Bob!

I see that sometimes the simple solutions are the best, aren't they? You
solved my problem with a very simple solution and now it works! Thank you
again!

jordi


Bob Jamison wrote:

> Jordi Massaguer wrote:
>
> >Hi all!
> >
> >I think I have an encoding problem and I don't know how to solve it. The
> >problem is that when the parser finds "uncommon" characters, for
> >example "@", it crashes. My XML file starts like this:
> >
> ><?xml version="1.0" encoding="UTF-8"?>
> >
> >I think that maybe a solution would be to change the encoding, but I
> >don't know which one to declare so that it accepts the largest set of
> >characters.
> >
>
> Jordi,
>
> Or you can do what lazy people like me do, and
> avoid the problem altogether:
>
> <?xml version="1.0"?>
>
> Bob
>




Re: Encoding problem....

Posted by Bob Jamison <rj...@lincom-asg.com>.
Jordi Massaguer wrote:

>Hi all!
>
>I think I have an encoding problem and I don't know how to solve it. The
>problem is that when the parser finds "uncommon" characters, for
>example "@", it crashes. My XML file starts like this:
>
><?xml version="1.0" encoding="UTF-8"?>
>
>I think that maybe a solution would be to change the encoding, but I
>don't know which one to declare so that it accepts the largest set of
>characters.
>


Jordi,

Or you can do what lazy people like me do, and
avoid the problem altogether:

<?xml version="1.0"?>
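
One caveat worth spelling out: without an encoding declaration, an XML 1.0
parser falls back on UTF-8 (or UTF-16 when a byte order mark says so), so
this shortcut only helps when the bytes really are in one of those
encodings. If the file is in some other encoding and cannot be edited,
another option is to decode it yourself and hand the parser characters
instead of bytes. The sketch below assumes the JDK's JAXP parser; the names
ExternalEncodingDemo and rootText are made up for the illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class ExternalEncodingDemo {

    /** Decodes the bytes with the given charset, parses, returns the root's text. */
    static String rootText(byte[] bytes, Charset charset) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler());
            // The parser receives characters, so it does no byte decoding itself.
            Reader chars = new InputStreamReader(new ByteArrayInputStream(bytes), charset);
            Document doc = builder.parse(new InputSource(chars));
            return doc.getDocumentElement().getTextContent();
        } catch (SAXException | IOException | ParserConfigurationException e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    public static void main(String[] args) {
        // A document with no encoding declaration, saved as ISO-8859-1.
        byte[] latin1 = "<?xml version=\"1.0\"?><text>\u00E4\u00F6\u00FC</text>"
                .getBytes(StandardCharsets.ISO_8859_1);
        String text = rootText(latin1, StandardCharsets.ISO_8859_1);
        System.out.println("\u00E4\u00F6\u00FC".equals(text));
    }
}
```

This moves the knowledge of the encoding from the file into the program, so
it is best kept for files whose declaration cannot be corrected at the
source.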




Bob


