Posted to c-dev@xerces.apache.org by Alberto Massari <am...@datadirect.com> on 2006/03/06 13:59:19 UTC

Re: Translation of some special characters in their entities won't work. Do i sth. Wrong or is it a bug?

Hi Joerg,
that's the expected behavior; inside attribute values, the apostrophe 
(when the attribute value is delimited by double quotes) and 'greater than' 
are not ambiguous symbols and can be used directly.
Anyhow, why does this trouble you?

Alberto
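
For illustration, here is a minimal standalone sketch (not the Xerces-C
source) of the escaping described above for an attribute value delimited by
double quotes: only '&', '<' and '"' are replaced, while '>' and the
apostrophe pass through untouched. The function name escapeAttrMinimal is
hypothetical, and a real serializer may escape further characters.

```cpp
#include <string>

// Hypothetical helper, for illustration only.
// Escapes just the characters that are ambiguous inside a double-quoted
// attribute value: '&' (starts a reference), '<' (not allowed literally)
// and '"' (the delimiter itself).
std::string escapeAttrMinimal(const std::string& value)
{
    std::string out;
    out.reserve(value.size());
    for (std::string::size_type i = 0; i < value.size(); ++i) {
        const char c = value[i];
        switch (c) {
            case '&': out += "&amp;";  break;
            case '<': out += "&lt;";   break;
            case '"': out += "&quot;"; break;
            default:  out += c;        break;  // '>' and '\'' are copied as-is
        }
    }
    return out;
}
```

Applied to the string from the question quoted below, this gives the same
attribute value that was reported there, with '>' and the apostrophe left
unescaped.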

At 01:40 PM 3/6/2006 +0100, Joerg Toellner wrote:
>Hi Group,
>
>Using:
>Xerces-C 2.7.0 (downloaded today xerces-current.zip and compiled it fresh)
>MSVC 7.1
>Windows XP
>
>Problem:
>If I set the value of an attribute node to a string that contains special
>characters, like this:
>
>...
>strcpy(key, "test");
>strcpy(value, "Abc < def > ghi & jkl ' mno \" pqr <= stu >= vwx =< zzz => aaa");
>...
>...
>void dom_SetAttr(DOMNode *node, const char *key, const char *value)
>{
>     // Do we have an element node?
>     if(node->getNodeType() == DOMNode::ELEMENT_NODE)
>     {
>         // Yes: cast to DOMElement and set the attribute.
>         // X() is assumed to be the transcoding macro from the Xerces samples.
>         DOMElement *el = static_cast<DOMElement *>(node);
>         el->setAttribute(X(key), X(value));
>     }
>}
>
>and serialize the document afterwards, I expect to get the following
>attribute value in the saved document file:
>
>test="Abc &lt; def &gt; ghi &amp; jkl &apos; mno &quot; pqr &lt;= stu &gt;=
>vwx =&lt; zzz =&gt; aaa"
>
>But i get this:
>
>test="Abc &lt; def > ghi &amp; jkl ' mno &quot; pqr &lt;= stu >= vwx =&lt;
>zzz => aaa"
>
>You see, all special characters are translated into their entities except the
>"greater than" sign and the apostrophe.
>
>Am I doing something wrong, am I missing something (setting a feature or the
>like), or is this a bug?
>
>I would really appreciate a hint or a pointer in the right direction on how
>to make this work. I have found nothing on the list, on the web, or in the docs.
>
>Thanks in advance for your time and answers
>Joerg Toellner
>
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: c-dev-unsubscribe@xerces.apache.org
>For additional commands, e-mail: c-dev-help@xerces.apache.org




AW: AW: Translation of some special characters in their entities won't work. Do i sth. Wrong or is it a bug?

Posted by Joerg Toellner <to...@oss-gmbh.de>.
Dear Alberto and Jesse,

Thanks again for your kind words and, of course, for your help.

I will certainly point that out to them, but I can't wait until they accept
my/your advice and change the rules. And even if the discussion is short,
it will take a long time before the rule change passes through all
levels and takes effect.

I'll try Alberto's advice for a quick solution now, and then try to push the
stone up the hill to get the rules changed so that I can undo my changes to
the Xerces source.

I love Xerces as it is, and I am very thankful for all the hard work you and
the whole dev team have done and will do.

Bye all
Joerg




Wohoo Success! (was Translation of some special characters in their entities won't work. Do i sth. Wrong or is it a bug?)

Posted by Joerg Toellner <to...@oss-gmbh.de>.
Dear Alberto and Jesse,

Dunno if you remember me (surely not :-) ). I asked the above-mentioned
question on this list in March 2006. I think it is OK if I give you an update
on it now.

To refresh your memory, the long story short:
I had problems with a regulatory organization in medical health care
here in Germany. They require certain special characters to be encoded as
entities in the documents they accept, for which we have to pass a test
procedure. They require this for the character '>' as well as for some
others. But Xerces does not encode > as &gt;, although it does so like a
charm for the others. So I asked here for help.

You two answered that it is perfectly legal under the W3C rules NOT to encode
the > character (the choice is up to the XML creator, as the W3C and you
said). So Xerces does not and will not do it, and this is correct.

But my authority still demanded the encoding. So Alberto wrote to me:
<Alberto>So they are accepting documents that they say are XML but refusing
standard XML documents... what do they claim they support?</Alberto> *LOL* I
love this sentence.

I promised that I would try to steer them (the authority) in the right
direction, but that it would take a while (government and authorities have
plenty of time, you know?). And I did as promised. I confronted them with your
opinions and the W3C standards and waited to see what would happen.

AND TODAY! WOHOOO! I got a new version of the specification they require for
their documents. And look, there I can read:

<authority-spec-german>
Zu ersetzende Zeichen
> &gt; beide Schreibweisen sind laut W3C-Spezifikation erlaubt
</authority-spec-german>

Translation:
Characters to be replaced
> &gt; both notations are permitted according to the W3C specification

Our Xerces-created XML documents have now passed all tests without errors,
we got our approval number, and we can start transmitting such documents to
them from our application.

So we have won and surely have enlightened some people there. Like the Boy
Scouts' motto, "Every day a good deed" :-)

Thanks again for supporting me in this case.

CU
Jörg




Re: AW: Translation of some special characters in their entities won't work. Do i sth. Wrong or is it a bug?

Posted by Alberto Massari <am...@datadirect.com>.
Hi Joerg,

At 03:30 PM 3/6/2006 +0100, Joerg Toellner wrote:
>Dear Alberto,
>
>Let me first thank you for your speedy reply.
>
>The problem is that I'm using Xerces to create XML documents for which I
>have strict guidelines that are outside my control. The guidelines come from
>a German healthcare authority, and I have to submit my documents for
>verification in order to get approval for our software to be used in
>medical healthcare here in Germany. I can't change anything about the rules.
>For me they are like a law from the government. I think starting a
>discussion about these characters not hurting anything is useless. You know,
>they are "officials" and not really thinking people. :-(

So they are accepting documents that they say are 
XML but refusing standard XML documents... what do they claim they support?


>I got my first submitted package back from them, and they criticize that I
>haven't encoded these characters as entities, as required by the documentation
>of the rules. They demand that all 5 characters (<, >, ', " and &) be encoded
>as entities, so I have to do it whether I want to or not.
>
>If you say that this is intended behavior, I think I have to use a
>workaround such as scanning my values for these two characters before handing
>them over to Xerces, and then Xerces will do the rest for me.
>
>Is it legal/possible to give Xerces a string like "rate is =&gt; 60% and <=
>80%" as an attribute value? Will Xerces then accept the already encoded
>entity and leave it as is, while happily translating the "less than" char?

No, if you set such a string as an attribute 
value you will obtain "rate is =&amp;gt; 60% 
and &lt;= 80%", as the "&" must be escaped in order 
to be preserved. You are better off modifying 
src/xercesc/framework/XMLFormatter.cpp so that 
the gEscapeChars global variable also has, on its 
third row, chCloseAngle and chSingleQuote as 
characters to be escaped (i.e. the same as the second row plus chLF).

Hope this helps,
Alberto
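
To illustrate both points, here is a standalone sketch that does not
reproduce XMLFormatter.cpp. escapeWith is a hypothetical helper that
replaces only the characters listed in a caller-supplied set, loosely
mirroring the idea of one row of escape characters per escaping mode. It
shows why a pre-escaped "&gt;" ends up as "&amp;gt;", and how widening the
set with '>' and the apostrophe escapes all five characters.

```cpp
#include <string>

// Hypothetical helper, for illustration only (not a Xerces-C API).
// Replaces each character of `value` that occurs in `escapeSet` with its
// predefined XML entity; every other character is copied unchanged.
std::string escapeWith(const std::string& value, const std::string& escapeSet)
{
    std::string out;
    out.reserve(value.size());
    for (std::string::size_type i = 0; i < value.size(); ++i) {
        const char c = value[i];
        if (escapeSet.find(c) == std::string::npos) {
            out += c;               // not in the set: leave as-is
            continue;
        }
        switch (c) {
            case '&':  out += "&amp;";  break;
            case '<':  out += "&lt;";   break;
            case '>':  out += "&gt;";   break;
            case '"':  out += "&quot;"; break;
            case '\'': out += "&apos;"; break;
            default:   out += c;        break;  // no predefined entity
        }
    }
    return out;
}
```

With the set &<" the pre-escaped input "rate is =&gt; 60% and <= 80%" comes
out as "rate is =&amp;gt; 60% and &lt;= 80%", because the '&' of the entity
is itself escaped. With the wider set &<>"' and the raw input
"rate is => 60% and <= 80%" the result is "rate is =&gt; 60% and &lt;= 80%".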


>Again, thanks very much for your time
>Greetings
>Joerg Toellner
>




AW: Translation of some special characters in their entities won't work. Do i sth. Wrong or is it a bug?

Posted by Joerg Toellner <to...@oss-gmbh.de>.
Dear Alberto,

Let me first thank you for your speedy reply.

The problem is that I'm using Xerces to create XML documents for which I
have strict guidelines that are outside my control. The guidelines come from
a German healthcare authority, and I have to submit my documents for
verification in order to get approval for our software to be used in
medical healthcare here in Germany. I can't change anything about the rules.
For me they are like a law from the government. I think starting a
discussion about these characters not hurting anything is useless. You know,
they are "officials" and not really thinking people. :-(

I got my first submitted package back from them, and they criticize that I
haven't encoded these characters as entities, as required by the documentation
of the rules. They demand that all 5 characters (<, >, ', " and &) be encoded
as entities, so I have to do it whether I want to or not.

If you say that this is intended behavior, I think I have to use a
workaround such as scanning my values for these two characters before handing
them over to Xerces, and then Xerces will do the rest for me.

Is it legal/possible to give Xerces a string like "rate is =&gt; 60% and <=
80%" as an attribute value? Will Xerces then accept the already encoded
entity and leave it as is, while happily translating the "less than" char?

Again, thanks very much for your time
Greetings
Joerg Toellner



