Posted to users@qpid.apache.org by Olivier Delbeke <Ol...@awtce.be> on 2017/12/21 16:51:27 UTC

Unexplained b" and " around message body with python and java, but not with C++

 Hi all,

When I send messages to a queue using the Java or Python APIs, the message body I receive on the other side (C++ binding) is prefixed with b" and suffixed with ".
When I send the same message with the C++ API, I do not see this behaviour.

Example: 
[C++] sender.send( proton::message("[MESSAGE]") )
--> results in [MESSAGE] in C++ ( to_string(message.body()) )

[python] sender.send( Message(body="[MESSAGE]") ) 
--> results in b"[MESSAGE]" in C++ ( to_string(message.body()) ) => b and " are added at the beginning, and " is added at the end.

I tried many things, such as forcing the "content_type" and the "content_encoding" or changing the "inferred" flag, but nothing helped. These annoying b" and " are still there.
Looking through the qpid-proton source code, I could not find where or why they would be added.

(BTW, I use qpid-proton 0.18.0)

Any ideas?

Olivier


Re: Unexplained b" and " around message body with python and java, but not with C++

Posted by Olivier Delbeke <Ol...@awtce.be>.
Thank you Chris,

Unfortunately, what happens here is still not completely clear to me.
When I look at the binary data being sent (using PN_TRACE_FRM=1), I notice that the b" never appears in the message on the wire (neither with C++ nor with Python), and that the body data itself is identical whichever binding I use to send it. My conclusion is that the only way the message can be decoded differently is if something in the message header tells the qpid-proton C or C++ library that the message body is binary, which would lead C++ to add the b" when converting it to a string (in to_string()). Does that make sense?
If this is correct, wouldn't it be possible to modify the message header to avoid this (a specific value for content_encoding?)? The messages I'm sending are JSON strings consisting only of basic ASCII characters, so in my case the Unicode (UTF-8) data == binary data.

I will try the u"XXX" tomorrow morning and keep you informed.

Best regards,
Olivier


Re: Unexplained b" and " around message body with python and java, but not with C++

Posted by Alan Conway <ac...@redhat.com>.
A note on the C++ API: a std::string or `const char*` is encoded as an
AMQP string (Unicode). To encode as AMQP binary, use the proton::binary type
(which is just a std::vector<uint8_t>).

When decoding, you can use e.g. proton::get<std::string>(msg.body()), which
throws unless the body type is exactly AMQP string, or
proton::coerce<std::string>(msg.body()), which succeeds if the body is
"string-like", i.e. AMQP string, symbol or binary. Check the documentation
of the proton::value type for more detail.
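
The difference between the two decoding calls can be modelled in a few
lines. The following is a Python sketch of the semantics described above,
not the proton API: the names get_string, coerce_string and ConversionError
are hypothetical, and the AMQP type is passed in as a plain label.

```python
# Model only: these helpers imitate the strict/lenient behaviour described
# for proton::get and proton::coerce. They are NOT qpid-proton functions.
STRING_LIKE = {"string", "symbol", "binary"}


class ConversionError(TypeError):
    """Raised when a body cannot be read as the requested type."""


def get_string(amqp_type, payload):
    """Strict read: succeed only if the body is exactly an AMQP string."""
    if amqp_type != "string":
        raise ConversionError("expected string, got " + amqp_type)
    return payload.decode("utf-8")


def coerce_string(amqp_type, payload):
    """Lenient read: accept any string-like body (string, symbol, binary)."""
    if amqp_type not in STRING_LIKE:
        raise ConversionError("cannot coerce " + amqp_type + " to string")
    # Assumption for this sketch: the bytes are valid UTF-8 (true for ASCII JSON).
    return payload.decode("utf-8")


if __name__ == "__main__":
    print(coerce_string("binary", b"[MESSAGE]"))
    try:
        get_string("binary", b"[MESSAGE]")
    except ConversionError as exc:
        print("strict read failed:", exc)
```

In this model a binary body is rejected by the strict read but accepted by
the lenient one, which mirrors the distinction made above.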


Re: Unexplained b" and " around message body with python and java, but not with C++

Posted by Gordon Sim <gs...@redhat.com>.
On 22/12/17 08:14, Olivier Delbeke wrote:
>   Hi Chris & all,
> 
> Your solution (defining the data as unicode) works :
> [python] sender.send( Message(body="[MESSAGE]") )    => to_string(message.body())=="b"[MESSAGE]"" in C++ at the receiver side
> [python] sender.send( Message(body=u"[MESSAGE]") )  => to_string(message.body())=="[MESSAGE]" in C++ at the receiver side
> 
> The difference in the data being sent is minimal :
> "[MESSAGE]"
> [0x21dec70]:0 -> @transfer(20) [handle=0, delivery-id=0, delivery-tag=b"1", message-format=0, settled=false, more=false, resume=false, aborted=false, batchable=false] (77) "\x00Sp\xd0\x00\x00\x00\x0b\x00\x00\x00\x05BP\x04@BR\x00\x00Ss\xd0\x00\x00\x00$\x00\x00\x00\x0d@@@@@@\xa3\x00\xa3\x00\x83\x00\x00\x00\x00\x00\x00\x00\x00\x83\x00\x00\x00\x00\x00\x00\x00\x00@R\x00@\x00Sw\xa0\x09[MESSAGE]"
> u"[MESSAGE]"
> [0x1e61c70]:0 -> @transfer(20) [handle=0, delivery-id=0, delivery-tag=b"1", message-format=0, settled=false, more=false, resume=false, aborted=false, batchable=false] (77) "\x00Sp\xd0\x00\x00\x00\x0b\x00\x00\x00\x05BP\x04@BR\x00\x00Ss\xd0\x00\x00\x00$\x00\x00\x00\x0d@@@@@@\xa3\x00\xa3\x00\x83\x00\x00\x00\x00\x00\x00\x00\x00\x83\x00\x00\x00\x00\x00\x00\x00\x00@R\x00@\x00Sw\xa1\x09[MESSAGE]"
> 
> If I'm reading correctly, I just see a \xa0 -> \xa1 (2 bytes before the message)
> Does this field represent the data encoding ?

Yes (0xa0 is binary, 0xa1 is string)
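
To illustrate, here is a minimal Python sketch that reads the format code of
the body section at the tail of the traced payload. It assumes a short
amqp-value section (descriptor 0x00 0x53 0x77 followed by a format code and a
one-byte length), as in the trace; describe_body is a hypothetical helper, not
part of qpid-proton.

```python
# Sketch only: handles just the two one-byte-length encodings seen in the trace.
FORMAT_CODES = {
    0xA0: "binary",  # vbin8: opaque octets, one-byte length
    0xA1: "string",  # str8-utf8: UTF-8 text, one-byte length
}

# Described type, descriptor is smallulong 0x77 (amqp-value section).
AMQP_VALUE_DESCRIPTOR = b"\x00\x53\x77"


def describe_body(section):
    """Return (amqp_type, payload) for a short amqp-value body section."""
    if section[:3] != AMQP_VALUE_DESCRIPTOR:
        raise ValueError("not an amqp-value section")
    code, length = section[3], section[4]
    if code not in FORMAT_CODES:
        raise ValueError("unsupported format code 0x%02x" % code)
    payload = section[5:5 + length]
    if len(payload) != length:
        raise ValueError("truncated body")
    return FORMAT_CODES[code], payload


if __name__ == "__main__":
    # Tails of the two traced payloads (plain "..." and u"..." senders).
    print(describe_body(b"\x00Sw\xa0\x09[MESSAGE]"))
    print(describe_body(b"\x00Sw\xa1\x09[MESSAGE]"))
```

Both tails carry the same nine payload bytes; only the format code in front
of the length byte tells the receiver whether to treat them as binary or as
a string.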



Re: Unexplained b" and " around message body with python and java, but not with C++

Posted by Olivier Delbeke <Ol...@awtce.be>.
 Hi Chris & all,

Your solution (defining the data as unicode) works:
[python] sender.send( Message(body="[MESSAGE]") )  => to_string(message.body()) gives b"[MESSAGE]" in C++ at the receiver side
[python] sender.send( Message(body=u"[MESSAGE]") ) => to_string(message.body()) gives [MESSAGE] in C++ at the receiver side

The difference in the data being sent is minimal:
"[MESSAGE]"
[0x21dec70]:0 -> @transfer(20) [handle=0, delivery-id=0, delivery-tag=b"1", message-format=0, settled=false, more=false, resume=false, aborted=false, batchable=false] (77) "\x00Sp\xd0\x00\x00\x00\x0b\x00\x00\x00\x05BP\x04@BR\x00\x00Ss\xd0\x00\x00\x00$\x00\x00\x00\x0d@@@@@@\xa3\x00\xa3\x00\x83\x00\x00\x00\x00\x00\x00\x00\x00\x83\x00\x00\x00\x00\x00\x00\x00\x00@R\x00@\x00Sw\xa0\x09[MESSAGE]"
u"[MESSAGE]"
[0x1e61c70]:0 -> @transfer(20) [handle=0, delivery-id=0, delivery-tag=b"1", message-format=0, settled=false, more=false, resume=false, aborted=false, batchable=false] (77) "\x00Sp\xd0\x00\x00\x00\x0b\x00\x00\x00\x05BP\x04@BR\x00\x00Ss\xd0\x00\x00\x00$\x00\x00\x00\x0d@@@@@@\xa3\x00\xa3\x00\x83\x00\x00\x00\x00\x00\x00\x00\x00\x83\x00\x00\x00\x00\x00\x00\x00\x00@R\x00@\x00Sw\xa1\x09[MESSAGE]"

If I'm reading this correctly, the only change is \xa0 -> \xa1 (2 bytes before the message text).
Does this byte represent the data encoding?

Best regards,
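
The observation can be checked mechanically. The sketch below copies the two
traced payloads into Python byte strings and lists the offsets at which they
differ; the variable names are illustrative only.

```python
# The first 63 bytes (header and properties sections) are identical in both
# traces, so they are written once here.
COMMON_PREFIX = (
    b"\x00Sp\xd0\x00\x00\x00\x0b\x00\x00\x00\x05BP\x04@BR\x00"
    b"\x00Ss\xd0\x00\x00\x00$\x00\x00\x00\x0d@@@@@@\xa3\x00\xa3\x00"
    b"\x83\x00\x00\x00\x00\x00\x00\x00\x00"
    b"\x83\x00\x00\x00\x00\x00\x00\x00\x00"
    b"@R\x00@"
)

# Body sections as traced for Message(body="...") and Message(body=u"...").
binary_payload = COMMON_PREFIX + b"\x00Sw\xa0\x09[MESSAGE]"
string_payload = COMMON_PREFIX + b"\x00Sw\xa1\x09[MESSAGE]"

# Offsets at which the two payloads differ.
diff_offsets = [
    i for i, (a, b) in enumerate(zip(binary_payload, string_payload)) if a != b
]
# Offset at which the body text itself starts.
body_offset = binary_payload.index(b"[MESSAGE]")

if __name__ == "__main__":
    print(len(binary_payload), len(string_payload))
    print(diff_offsets, body_offset)
```

Both payloads are 77 bytes long, matching the (77) in the trace, and they
differ at a single offset (66), two bytes before the body text at offset 68.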


Re: Unexplained b" and " around message body with python and java, but not with C++

Posted by Chris Richardson <cr...@fourc.eu>.
Hi Olivier,

The 'b' prefix in this case comes from the character-encoding handling of
the language in use: it indicates that the string is a "bytes" object rather
than a Unicode string, which is what the C++ binding expects by default.
Have a browse of the relevant character encoding documentation for the
language in question, for instance
https://www.pythoncentral.io/python-unicode-encode-decode-strings-python-2x/
for Python. For the simple case of a string literal, you could try creating
your message with Message(u"[MESSAGE]").

Hope that helps

Chris
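
One way to apply this on the sending side is to normalise the body to text
before building the message. This is a minimal sketch, assuming (as the
thread suggests) that the Python binding encodes text as an AMQP string and
bytes as AMQP binary; as_amqp_text is a hypothetical helper and no proton
call is made here.

```python
def as_amqp_text(body, encoding="utf-8"):
    """Return body as a text object, decoding it first if it is bytes.

    Hypothetical helper: under Python 2 a plain "..." literal is a bytes
    object, so it is decoded; text objects are returned unchanged.
    """
    if isinstance(body, bytes):
        return body.decode(encoding)
    return body


if __name__ == "__main__":
    # Both calls yield the same text object, whatever the literal's type.
    print(as_amqp_text(b"[MESSAGE]") == as_amqp_text(u"[MESSAGE]"))
```

The result would then be passed as the body, e.g.
Message(body=as_amqp_text(payload)), where payload stands for whatever the
application is sending.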




-- 

*Chris Richardson*, System Architect
cr@fourc.eu


*FourC AS, Vestre Rosten 81, Trekanten, NO-7075 Tiller, Norway, www.fourc.eu
<http://www.fourc.eu/>*
