Posted to soap-user@ws.apache.org by "Pospisil, Pavel" <Po...@gedas.cz> on 2005/03/10 15:41:41 UTC
Problem with encoding String data
Hello,
I am using SOAP 2.3.1 for our web service client implementation,
and I am now facing a problem with sending two Czech characters, "Ř" and "Á".
(All other Czech characters are encoded and transmitted without problems.)
Our WS client runs with the default encoding windows-1250,
but the communication needs to be in UTF-8.
public class AuthData
{
    private String m_login;
    private String m_password;

    public AuthData(String login, String password)
    {
        m_login = login;
        m_password = password;
    }
}
.....
public CiselnikPolozkaModel[] CiselnikModel(AuthData authData) throws Exception
{
    CiselnikPolozkaModel[] returnVal = null;
    URL endpointURL = new URL(endpoint);
    Call call = new Call();
    call.setSOAPTransport(m_httpConnection);
    call.setTargetObjectURI("urn:Autoplus");
    call.setMethodName("CiselnikModel");
    call.setEncodingStyleURI(Constants.NS_URI_SOAP_ENC);
    Vector params = new Vector();
    params.addElement(new Parameter("authData", AuthData.class, authData, null));
    call.setParams(params);
    call.setSOAPMappingRegistry(m_smr);
    Response response = call.invoke(endpointURL,
        "urn:Autoplus#ws_autoplus#CiselnikModel");
    if (!response.generatedFault())
    {
        Parameter result = response.getReturnValue();
        returnVal = (CiselnikPolozkaModel[])result.getValue();
    }
    else
    {
        Fault fault = response.getFault();
        throw new SOAPException(fault.getFaultCode(), fault.getFaultString());
    }
    return returnVal;
}
.....
Because the communication uses UTF-8 encoding, I need to encode the Czech
characters into UTF-8
before constructing AuthData (or any other parameter that takes a String in its
constructor);
otherwise I get something like <faultstring xsi:type="xsd:string">XML error
not well-formed (invalid token)</faultstring> from the server.
I tried a primitive encoding of each String parameter, like this:
AuthData a = new AuthData(toUTF8("ŽŠČ"), toUTF8("Řach"));
private static String toUTF8(String input)
    throws Exception
{
    try
    {
        if (input != null)
        {
            return new String(input.getBytes("UTF8"));
        }
        else
            return input;
    }
    catch(UnsupportedEncodingException e)
    {
        throw new Exception (e.getLocalizedMessage());
    }
}
It works well for all Czech characters except "Ř" and "Á".
In these two cases the conversion gives
Ř -> Ĺ?, Á -> Ă?
and the server returns the "not well-formed" error to me.
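A possible explanation for why only these two characters break, offered as a sketch rather than a confirmed diagnosis: new String(bytes) decodes the UTF-8 bytes with the platform default charset. The UTF-8 form of "Ř" is C5 98 and of "Á" is C3 81, and bytes 0x98 and 0x81 are unassigned in windows-1250, so the second byte cannot be decoded and is lost. The class name RoundTripDemo and its helper methods are hypothetical, and windows-1250 is passed explicitly here to stand in for the default encoding described above.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class RoundTripDemo {
    // Assumption: stands in for the JVM default charset described in the post.
    static final Charset CP1250 = Charset.forName("windows-1250");

    // Same effect as new String(input.getBytes("UTF8")) on a windows-1250 JVM:
    // the UTF-8 bytes are reinterpreted as windows-1250 characters.
    static String misdecode(String input) {
        return new String(input.getBytes(StandardCharsets.UTF_8), CP1250);
    }

    // True if writing the misdecoded string back out in windows-1250
    // reproduces the original UTF-8 byte sequence.
    static boolean survives(String input) {
        byte[] wire = misdecode(input).getBytes(CP1250);
        return Arrays.equals(wire, input.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file independent of its own encoding:
        // \u017D = Z with caron, \u0158 = R with caron, \u00C1 = A with acute.
        String[] samples = {"\u017D", "\u0158", "\u00C1"};
        for (String s : samples) {
            System.out.println("U+" + Integer.toHexString(s.charAt(0)).toUpperCase()
                + " survives the round trip: " + survives(s));
        }
    }
}
```

If this reading is right, the trick only appears to work for the other characters because both of their UTF-8 bytes happen to be assigned in windows-1250.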
I also tried to encode the string with
private static String toUTF8(String input)
    throws Exception
{
    try
    {
        if (input != null)
        {
            char[] ch = input.toCharArray();
            StringBuffer sb = new StringBuffer();
            for (int i = 0; i < ch.length; i++)
            {
                // ASCII passes through; everything else becomes a
                // numeric character reference such as &#344;
                if ((int)ch[i] < 128)
                    sb.append(ch[i]);
                else
                {
                    sb.append("&#" + (int)ch[i] + ";");
                }
            }
            return sb.toString();
        }
        else
        {
            return input;
        }
    }
    catch(Exception e)
    {
        throw new Exception (e.getLocalizedMessage());
    }
}
but in this case SOAP encodes my ŘÁ as &amp;#344;&amp;#193;.
The server returns no error, but the transmitted data are the literal text
&#344;&#193; instead of "Ř", "Á" (of course :-))
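A likely reason for this result, assuming the SOAP serializer escapes markup characters in String values: the ampersand of each hand-made character reference is itself escaped, so the receiver sees the reference as plain text. The sketch below uses a hypothetical escapeXml helper to show the effect; it is not Apache SOAP's own code.

```java
class EscapeDemo {
    // Hypothetical helper: minimal XML text escaping, as a serializer
    // would typically apply to a String value before writing it out.
    static String escapeXml(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The pre-encoded string is escaped a second time on the way out.
        System.out.println(escapeXml("&#344;&#193;"));
    }
}
```

Under that assumption, character references cannot be injected through a String parameter; they would have to be produced by the serializer itself.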
Please help me with a hint on how to safely encode and decode String data
from the default windows-1250 to UTF-8 and send it to the WS.
Sincerely,
Pavel