Posted to dev@nifi.apache.org by choedebeck <cr...@yahoo.com> on 2016/09/14 16:01:20 UTC

Binary Data over the PutTCP processor

I have a ConsumeJMS processor feeding into a PutTCP processor.  The JMS
messages are binary messages.  I have the "Connection Per FlowFile" property
set to true, and I am using that to determine when the message has been sent.

The "Outgoing Message Delimiter" is set to '\n', but I seem to be reading
conflicting information on whether it is used when the Connection Per
FlowFile setting is true.

The issue is that it seems the binary data I am receiving is not the same as
the binary data I sent.  My guess is that the binary data is getting encoded
into the UTF-8 character set which is corrupting my data.

My question is, is there a way around this issue?  What is the use case for
sending arbitrary binary data over the PutTCP processor?



--
View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/Binary-Data-over-the-PutTCP-processor-tp13364.html
Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.

Re: Binary Data over the PutTCP processor

Posted by Bryan Bende <bb...@gmail.com>.
Looks like someone did create a JIRA:

https://issues.apache.org/jira/browse/NIFI-2748

On Thu, Feb 9, 2017 at 12:26 PM, Ryan Ward <ry...@gmail.com> wrote:
> Bryan - I started to evaluate PutTCP and agree that Outgoing Message
> Delimiter should be optional. I currently have to run SplitText down to 1
> line per FlowFile in order to use PutTCP, which doesn't seem like the most
> efficient approach. Was a JIRA ever created for this capability?

Re: Binary Data over the PutTCP processor

Posted by Ryan Ward <ry...@gmail.com>.
Bryan - I started to evaluate PutTCP and agree that Outgoing Message
Delimiter should be optional. I currently have to run SplitText down to 1
line per FlowFile in order to use PutTCP, which doesn't seem like the most
efficient approach. Was a JIRA ever created for this capability?

On Wed, Sep 14, 2016 at 4:41 PM, Bryan Bende <bb...@gmail.com> wrote:

> Hello,
>
> We should probably make "Outgoing Message Delimiter" optional when
> selecting "Connection Per FlowFile".
>
> Do you want to create a JIRA for this?
>
> Thanks,
>
> Bryan

Re: Binary Data over the PutTCP processor

Posted by Bryan Bende <bb...@gmail.com>.
Hello,

We should probably make "Outgoing Message Delimiter" optional when
selecting "Connection Per FlowFile".

Do you want to create a JIRA for this?

Thanks,

Bryan

On Wed, Sep 14, 2016 at 3:36 PM, McDermott, Chris Kevin (MSDU -
STaTS/StorefrontRemote) <ch...@hpe.com> wrote:

> Hello friend,
>
> I’m not really a developer but I thought I might chime in while you wait
> for a more “official” answer.
>
> Reading the original JIRA for the PutTCP processor and glancing at the
> code, it seems to send the binary data unmolested.  The Character Set
> setting (e.g. UTF-8) is only to allow the proper binary conversion of the
> message delimiter to a byte sequence.  Yes, the documentation for the
> Outgoing Message Delimiter could certainly be more specific.
>
> Did you try sending the binary data without a message delimiter?
>
> I hope this helps.
>
> Chris McDermott
>
> Remote Business Analytics
> STaTS/StoreFront Remote
> HPE Storage
> Hewlett Packard Enterprise
> Mobile: +1 978-697-5315

Re: Binary Data over the PutTCP processor

Posted by choedebeck <cr...@yahoo.com>.
The message delimiter is a required field, and if I set it to an empty
string, the processor transitions to an error state.  The documentation I
read gives mixed messages about what this field does when the "Connection
Per FlowFile" (one message per connection) property is also set.

From what I was able to discover in my testing, one message does indeed get
sent per connection, but the message delimiter is appended to the end of the
message.  Once I modified my receive code to ignore the last byte, I was able
to verify that the data going out matched the data coming in.
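
As an illustration of the receive-side fix described above, here is a minimal
Python sketch. It is not NiFi code and not the poster's actual receiver. It
assumes the sender opens one connection per message, appends a single '\n'
delimiter, and then closes the connection. The names read_message and
DELIMITER are hypothetical.

```python
import socket

DELIMITER = b"\n"  # assumed to match the sender's Outgoing Message Delimiter


def read_message(conn: socket.socket, delimiter: bytes = DELIMITER) -> bytes:
    """Read until the peer closes the connection, then drop one trailing delimiter."""
    chunks = []
    while True:
        chunk = conn.recv(4096)
        if not chunk:  # an empty read means the sender closed the connection
            break
        chunks.append(chunk)
    data = b"".join(chunks)
    # Remove only the final delimiter; delimiter bytes inside the payload are kept.
    if delimiter and data.endswith(delimiter):
        data = data[: -len(delimiter)]
    return data
```

Because the end of the message is signalled by the connection closing, any
'\n' bytes inside the payload are left alone; only the last delimiter is
stripped.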

Perhaps I was wrong about the data being mangled, but encoding to Base64 did
fix all my issues.  There was something about my data that the processor
didn't like.  The only other thing I can think of is that the binary data
going across did have '\n' bytes in it, but since I had one message per
connection I didn't think that mattered.
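
To illustrate why '\n' bytes inside the payload could matter, here is a small
Python sketch. The helper names are hypothetical and nothing here is taken
from NiFi or from the receiving application. It compares a receiver that
splits the stream on the delimiter with one that treats everything up to the
connection close as a single message.

```python
def split_on_delimiter(stream: bytes, delimiter: bytes = b"\n") -> list:
    """Line-oriented receiver: every delimiter occurrence ends a message."""
    parts = stream.split(delimiter)
    # A trailing delimiter leaves an empty final element; drop it.
    if parts and parts[-1] == b"":
        parts.pop()
    return parts


def whole_connection(stream: bytes, delimiter: bytes = b"\n") -> bytes:
    """Connection-oriented receiver: one message per connection, minus the final delimiter."""
    if delimiter and stream.endswith(delimiter):
        return stream[: -len(delimiter)]
    return stream


payload = bytes([0x01, 0x0A, 0x02])  # binary payload that happens to contain 0x0A ('\n')
on_the_wire = payload + b"\n"        # payload followed by the appended delimiter
```

With these inputs, the line-oriented receiver sees two fragments instead of
one message, while the connection-oriented receiver recovers the original
payload. Whether this is what happened in the case above is only a guess.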



--
View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/Binary-Data-over-the-PutTCP-processor-tp13364p13367.html
Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.

Re: Binary Data over the PutTCP processor

Posted by "McDermott, Chris Kevin (MSDU - STaTS/StorefrontRemote)" <ch...@hpe.com>.
Hello friend,

I’m not really a developer but I thought I might chime in while you wait for a more “official” answer.

Reading the original JIRA for the PutTCP processor and glancing at the code, it seems to send the binary data unmodified.  The Character Set setting (e.g. UTF-8) is used only to convert the message delimiter into a byte sequence.  Yes, the documentation for the Outgoing Message Delimiter could certainly be more specific.
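
The behaviour described above can be modelled roughly as follows. This is an illustrative Python sketch, not the PutTCP source; frame_message and its parameters are hypothetical names. It assumes the character set is applied to the delimiter text only and that the payload bytes are written as-is.

```python
def frame_message(payload: bytes, delimiter: str, charset: str = "utf-8") -> bytes:
    """Append the delimiter, encoded with the given character set, to the raw payload."""
    # Only the delimiter text is encoded; the payload bytes pass through untouched.
    return payload + delimiter.encode(charset)


payload = bytes([0xFF, 0xFE, 0x00, 0x0A])  # not valid UTF-8, and contains a '\n' byte
framed_utf8 = frame_message(payload, "\n", "utf-8")
framed_utf16 = frame_message(payload, "\n", "utf-16-le")
```

In this model, changing the character set changes only the bytes appended after the payload, never the payload itself.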

Did you try sending the binary data without a message delimiter?

I hope this helps.

Chris McDermott
 
Remote Business Analytics
STaTS/StoreFront Remote
HPE Storage
Hewlett Packard Enterprise
Mobile: +1 978-697-5315
 



On 9/14/16, 1:56 PM, "choedebeck" <cr...@yahoo.com> wrote:

    I was able to resolve this issue by encoding my data as Base64 before
    sending, and then decoding the data I received from the PutTCP processor. 
    It is important to point out though that I was only able to do this because
    I controlled both the data being sent.  I can imagine several use cases
    where this would not be the case.
    
    This was very inconvenient, and at a minimum I believe the documentation
    should be updated to spell out that if binary data is going to be sent
    across this processor, it must be encoded as text.  Ideally, there would be
    some kind of setting that would allow me to send binary data across this
    processor without requiring encoding.
    
    
    
    



Re: Binary Data over the PutTCP processor

Posted by choedebeck <cr...@yahoo.com>.
I was able to resolve this issue by encoding my data as Base64 before
sending, and then decoding the data I received from the PutTCP processor. 
It is important to point out though that I was only able to do this because
I controlled both the data being sent and the code receiving it.  I can
imagine several use cases where this would not be the case.

This was very inconvenient, and at a minimum I believe the documentation
should be updated to spell out that if binary data is going to be sent
across this processor, it must be encoded as text.  Ideally, there would be
some kind of setting that would allow me to send binary data across this
processor without requiring encoding.
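
The Base64 workaround described above can be sketched in Python as follows.
The function names are hypothetical, and the sketch assumes a '\n' delimiter
is appended after the encoded payload. Standard Base64 output contains no
newline bytes, so the delimiter can never appear inside the encoded message.

```python
import base64


def encode_for_send(payload: bytes) -> bytes:
    """Base64-encode the payload so it contains only ASCII characters."""
    return base64.b64encode(payload)


def decode_after_receive(received: bytes, delimiter: bytes = b"\n") -> bytes:
    """Strip one trailing delimiter, if present, then Base64-decode."""
    if delimiter and received.endswith(delimiter):
        received = received[: -len(delimiter)]
    return base64.b64decode(received)


payload = bytes(range(256))          # every byte value, including 0x0A
encoded = encode_for_send(payload)
```

The trade-off is that Base64 makes each message about one third larger, and
both ends have to agree on the encoding.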



--
View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/Binary-Data-over-the-PutTCP-processor-tp13364p13365.html
Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.