Posted to c-users@xerces.apache.org by "ritesh.dhope" <ri...@siemens.com> on 2010/04/06 23:20:39 UTC

Xerces CPP does not replace tab character in attribute value with hexadecimal char reference

I have sample Xerces-C++ writer code that creates an attribute with a tab
character in its value. When I write the XML file out, the tab character is
preserved and written as a literal tab. I want it written as its hexadecimal
character reference (&#x9;) so that the tab survives when the file is parsed
back. The same code using Xerces-J replaces the tab character with the
hexadecimal character reference.

Here are the code snippets.
-------------------
C++
-------------------
	XMLPlatformUtils::Initialize();

	DOMImplementation * pDOMImplementation =
		DOMImplementationRegistry::getDOMImplementation(XMLString::transcode("core"));

	DOMDocument * pDOMDocument = pDOMImplementation->createDocument(0, L"Hello_World", 0);
	DOMElement * pRootElement = pDOMDocument->getDocumentElement();

	// Attribute value starting with a real tab character
	DOMElement * pRow = pDOMDocument->createElement(L"row");
	pRow->setAttribute(L"description", L"\tThe value of PI");
	pRootElement->appendChild(pRow);

	// Attribute value starting with the literal text "&#x09;"
	DOMElement * pRow1 = pDOMDocument->createElement(L"row");
	pRow1->setAttribute(L"description", L"&#x09;The vale of PI");
	pRootElement->appendChild(pRow1);

	// Attribute value starting with a real line feed character
	DOMElement * pRow2 = pDOMDocument->createElement(L"row");
	pRow2->setAttribute(L"description", L"\nThe value of PI");
	pRootElement->appendChild(pRow2);

	DOMWriter *pwriter = pDOMImplementation->createDOMWriter();
	XMLFormatTarget *pTarget = new LocalFileFormatTarget("f:\\june1\\june.xml");

	pwriter->writeNode(pTarget, *pDOMDocument);
-------------------------------------------------------------------------

------------------
Java
------------------
            Document zTest =
                DocumentBuilderFactoryImpl.newInstance().newDocumentBuilder().newDocument();
            Element base = zTest.createElement( "Base" );
            base.setAttribute( "test", "test tab \t, 	 and \u0009 as &#x09; value" );
            zTest.appendChild( base );

            Serializer z = new XMLSerializer();
            
            z.setOutputByteStream( System.out );
            z.asDOMSerializer().serialize( zTest );
----------------------------------------------------------------------
-- 
View this message in context: http://old.nabble.com/Xerces-CPP-does-not-replace-tab-character-in-attribute-value-with-hexadecimal-char-reference-tp28157670p28157670.html
Sent from the Xerces - C - Users mailing list archive at Nabble.com.


Re: Xerces CPP does not replace tab character in attribute value with hexadecimal char reference

Posted by Alberto Massari <Al...@progress.com>.
In that case, change the declaration at the beginning of
framework/XMLFormatter.cpp to this:

static const unsigned int kEscapeCount = 7;
static const XMLCh gEscapeChars[XMLFormatter::EscapeFlags_Count][kEscapeCount] =
{
        { chNull      , chNull       , chNull        , chNull       , chNull        , chNull    , chNull }
    ,   { chAmpersand , chCloseAngle , chDoubleQuote , chOpenAngle  , chSingleQuote , chNull    , chNull }
    ,   { chAmpersand , chOpenAngle  , chDoubleQuote , chLF         , chCR          , chHTab    , chNull }
    ,   { chAmpersand , chOpenAngle  , chCloseAngle  , chNull       , chNull        , chNull    , chNull }
};

Alberto
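
For readers who want to see the intended effect without rebuilding the
library, here is a minimal standalone sketch of the escaping that the third
row of the table above (the one containing chLF, chCR and chHTab) appears to
describe for attribute values. The function name escapeAttrValue is
hypothetical and not part of Xerces-C++, it works on plain ASCII std::string
rather than XMLCh, and the exact spelling of the references (&#x9;, &#xA;,
&#xD;) is an assumption.

```cpp
#include <string>

// Hypothetical helper, NOT a Xerces-C++ API: escapes an ASCII attribute
// value so that tab, LF and CR are written as character references and
// therefore survive attribute-value normalization when parsed back.
std::string escapeAttrValue(const std::string& value)
{
    std::string out;
    out.reserve(value.size());
    for (char c : value) {
        switch (c) {
            case '&':  out += "&amp;";  break;  // chAmpersand
            case '<':  out += "&lt;";   break;  // chOpenAngle
            case '"':  out += "&quot;"; break;  // chDoubleQuote
            case '\n': out += "&#xA;";  break;  // chLF
            case '\r': out += "&#xD;";  break;  // chCR
            case '\t': out += "&#x9;";  break;  // chHTab
            default:   out += c;        break;
        }
    }
    return out;
}
```

With this sketch, a value that starts with a real tab comes out starting with
&#x9;, while a value that already contains the text &#x09; has its ampersand
escaped to &amp;, which matches what was reported earlier in the thread.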

On 4/7/2010 5:54 PM, ritesh.dhope wrote:
> I am using XercesC version 2.7.0.


Re: Xerces CPP does not replace tab character in attribute value with hexadecimal char reference

Posted by "ritesh.dhope" <ri...@siemens.com>.
I am using XercesC version 2.7.0.


Alberto Massari-2 wrote:
> 
> What version of Xerces are you using? If it's not a 3.x version, you are 
> hitting the bug described at 
> http://issues.apache.org/jira/browse/XERCESC-1547
> 
> Alberto


Re: Xerces CPP does not replace tab character in attribute value with hexadecimal char reference

Posted by Alberto Massari <Al...@progress.com>.
What version of Xerces are you using? If it's not a 3.x version, you are 
hitting the bug described at 
http://issues.apache.org/jira/browse/XERCESC-1547

Alberto

On 4/7/2010 5:15 PM, ritesh.dhope wrote:
> I don't mind taking that, but there is a discrepancy in how XercesC and XercesJ
> behave. XercesC writes a tab character as a literal tab, whereas XercesJ writes
> it as a hexadecimal character reference.
>
> Writing a literal tab is a bad idea, as the parser will convert it to a space
> when it reads the XML file. The hexadecimal character reference is read back as
> a tab character by the parser.
>
> - Ritesh


RE: Xerces CPP does not replace tab character in attribute value with hexadecimal char reference

Posted by "ritesh.dhope" <ri...@siemens.com>.
I don't mind taking that, but there is a discrepancy in how XercesC and XercesJ
behave. XercesC writes a tab character as a literal tab, whereas XercesJ writes
it as a hexadecimal character reference.

Writing a literal tab is a bad idea, as the parser will convert it to a space
when it reads the XML file. The hexadecimal character reference is read back as
a tab character by the parser.

- Ritesh
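
To make the round-trip difference concrete, here is a rough standalone sketch
of attribute-value normalization as I understand it from the XML
recommendation: literal tab, LF and CR each become a space, while a numeric
character reference is replaced by the character it names. The function name
normalizeAttrValue is hypothetical (it is not a Xerces API), it handles ASCII
only, it ignores named entities, and it does not perform the earlier
line-ending step that folds CR LF into a single LF.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper, NOT a Xerces-C++ API: simplified attribute-value
// normalization for CDATA attributes (ASCII only, numeric references only).
std::string normalizeAttrValue(const std::string& raw)
{
    std::string out;
    std::size_t i = 0;
    while (i < raw.size()) {
        const char c = raw[i];
        if (c == '\t' || c == '\n' || c == '\r') {
            out += ' ';                       // literal whitespace -> space
            ++i;
        } else if (c == '&' && i + 1 < raw.size() && raw[i + 1] == '#') {
            const std::size_t end = raw.find(';', i);
            if (end == std::string::npos) {   // malformed: copy through
                out += c;
                ++i;
                continue;
            }
            const bool hex = (i + 2 < end && raw[i + 2] == 'x');
            const std::size_t start = i + (hex ? 3 : 2);
            const std::string digits = raw.substr(start, end - start);
            if (digits.empty()) {             // "&#;" or "&#x;": copy through
                out += c;
                ++i;
                continue;
            }
            const long code = std::strtol(digits.c_str(), nullptr, hex ? 16 : 10);
            out += static_cast<char>(code);   // referenced character kept as-is
            i = end + 1;
        } else {
            out += c;
            ++i;
        }
    }
    return out;
}
```

Under these assumptions, a value written with a literal tab is read back with
a space in its place, whereas the same value written with &#x9; is read back
with the tab intact.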



John Lilley wrote:
> 
> Perhaps this is good to read.  I am not expert enough to say, but it seems
> to me that tabs in attribute values may not be preserved:
> http://www.w3.org/TR/REC-xml/#AVNormalize
> 
> john


RE: Xerces CPP does not replace tab character in attribute value with hexadecimal char reference

Posted by John Lilley <jl...@datalever.com>.
Perhaps this is good to read.  I am not expert enough to say, but it seems to me that tabs in attribute values may not be preserved:
http://www.w3.org/TR/REC-xml/#AVNormalize

john

-----Original Message-----
From: ritesh.dhope [mailto:ritesh.dhope@siemens.com]
Sent: Wednesday, April 07, 2010 8:00 AM
To: c-users@xerces.apache.org
Subject: RE: Xerces CPP does not replace tab character in attribute value with hexadecimal char reference


I tried writing \t as well as &#x09;. \t gets written as a literal tab, whereas
&#x09; gets written as &amp;#x09;, with the & escaped.

What I finally need is &#x09; written out to the XML file. Alternatively, is
there any way to prevent the & in &#x09; from being replaced with &amp;?

- Ritesh


RE: Xerces CPP does not replace tab character in attribute value with hexadecimal char reference

Posted by "ritesh.dhope" <ri...@siemens.com>.
I tried writing \t as well as &#x09;. \t gets written as a literal tab, whereas
&#x09; gets written as &amp;#x09;, with the & escaped.

What I finally need is &#x09; written out to the XML file. Alternatively, is
there any way to prevent the & in &#x09; from being replaced with &amp;?

- Ritesh
--------------------------------------------------------------

John Lilley wrote:
> 
> I think that was the first line in the example:
>>      pRow->setAttribute(L"description", L"\tThe value of PI");
> 
> -----Original Message-----
> From: Alberto Massari [mailto:Alberto.Massari@progress.com] 
> Sent: Wednesday, April 07, 2010 1:10 AM
> To: c-users@xerces.apache.org
> Subject: Re: Xerces CPP does not replace tab character in attribute value
> with hexadecimal char reference
> 
> If you invoke
> 
> pRow1->setAttribute(L"description", L"&#x09;The vale of PI");
> 
> what you are writing is the literal "&#x09;", not a tab character. Have
> you tried writing the same content as in the Java version,
> 
> pRow1->setAttribute(L"description", L"test tab \t, 	 and \x09 as &#x09; value" );
> 
> Alberto



RE: Xerces CPP does not replace tab character in attribute value with hexadecimal char reference

Posted by John Lilley <jl...@datalever.com>.
I think that was the first line in the example:
>      pRow->setAttribute(L"description", L"\tThe value of PI");

-----Original Message-----
From: Alberto Massari [mailto:Alberto.Massari@progress.com] 
Sent: Wednesday, April 07, 2010 1:10 AM
To: c-users@xerces.apache.org
Subject: Re: Xerces CPP does not replace tab character in attribute value with hexadecimal char reference

If you invoke

pRow1->setAttribute(L"description", L"&#x09;The vale of PI");

what you are writing is the literal "&#x09;", not a tab character. Have you tried writing the same content as in the Java version,

pRow1->setAttribute(L"description", L"test tab \t, 	 and \x09 as&#x09; value" );

Alberto




Re: Xerces CPP does not replace tab character in attribute value with hexadecimal char reference

Posted by Alberto Massari <Al...@progress.com>.
If you invoke

pRow1->setAttribute(L"description", L"&#x09;The vale of PI");

what you are writing is the literal "&#x09;", not a tab character. Have you tried writing the same content as in the Java version,

pRow1->setAttribute(L"description", L"test tab \t, 	 and \x09 as&#x09; value" );

Alberto
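
For context on the parse side, here is a rough sketch of XML 1.0
attribute-value normalization (section 3.3.3) for a CDATA attribute: a
literal tab, line feed or carriage return becomes a space, while a character
reference yields the referenced character. The function name
normalizeAttributeValue is hypothetical, this is not how Xerces implements
it, and the sketch handles only ASCII character references and the five
predefined entities.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical illustration, not Xerces code. Assumes well-formed input
// and ASCII-only character references.
std::string normalizeAttributeValue(const std::string& raw) {
    std::string out;
    std::size_t i = 0;
    while (i < raw.size()) {
        const char c = raw[i];
        if (c == '\t' || c == '\n' || c == '\r') {
            out += ' ';  // literal white space is normalized to a space
            ++i;
            continue;
        }
        // Only look for a terminating ';' when a reference starts here.
        const std::size_t end =
            (c == '&') ? raw.find(';', i) : std::string::npos;
        if (end == std::string::npos) {
            out += c;  // ordinary character (or unterminated '&')
            ++i;
            continue;
        }
        const std::string name = raw.substr(i + 1, end - i - 1);
        if (name.size() > 1 && name[0] == '#') {
            // Character reference: &#xHH; (hex) or &#DD; (decimal).
            const bool hex = (name[1] == 'x');
            const long code = std::strtol(name.c_str() + (hex ? 2 : 1),
                                          nullptr, hex ? 16 : 10);
            out += static_cast<char>(code);
        } else if (name == "amp") {
            out += '&';
        } else if (name == "lt") {
            out += '<';
        } else if (name == "gt") {
            out += '>';
        } else if (name == "quot") {
            out += '"';
        } else if (name == "apos") {
            out += '\'';
        } else {
            out += raw.substr(i, end - i + 1);  // unknown entity: copy as is
        }
        i = end + 1;
    }
    return out;
}
```

Under these assumptions a literal tab in the file comes back as a space,
"&#x9;" comes back as a tab, and "&amp;#x09;" comes back as the six-character
text "&#x09;", which is why only the character reference round-trips.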

