Posted to user@flume.apache.org by Søren <sd...@syntonetic.com> on 2011/12/07 11:14:57 UTC
thrift to flume node
Hi all Flume users
We are looking into sending content directly to a flume node using
thrift. One of our sources is a C# .NET app. I have seen in an earlier thread
http://mail-archives.apache.org/mod_mbox/incubator-flume-user/201109.mbox/browser
that this should be possible, but I haven't come across exhaustive
documentation targeting this particular use.
Is it working well?
I have collected the bits of information on the subject, and the only
piece I seem to be missing is where to find a
flume.thrift file compatible with flume-node v0.9.4, or an alternative
sufficient for writing to a flume node from C#.
If anyone knows about this, please share.
Thanks in advance
Søren
Re: thrift to flume node
Posted by Søren <sd...@syntonetic.com>.
It seems the flume.thrift is right.
We solved it ourselves, drawing on various code examples:
http://blog.ozbuyucusu.com/2011/08/01/flume-rpcsource-example/
and Eran's C# code in an earlier post in this forum.
Case closed.
brgs Søren
On 08/12/2011 11:01, Søren wrote:
> Thanks Matthew, that actually got us further.
> And thanks for the advice. We have already experienced hanging nodes
> and are taking precautions until that is fixed.
>
> But something in our setup is not compatible.
>
> The receiving node (Flume for Windows version 0.9.4) is reporting:
> ------------------------
> 11/12/08 10:20:05 ERROR server.TSaneThreadPoolServer: Thrift error occurred during processing of message.
> org.apache.thrift.protocol.TProtocolException: Missing version in readMessageBegin, old client?
>     at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:213)
>     at com.cloudera.flume.handlers.thrift.ThriftFlumeEventServer$Processor.process(ThriftFlumeEventServer.java:224)
>     at org.apache.thrift.server.TSaneThreadPoolServer$WorkerProcess.run(TSaneThreadPoolServer.java:280)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>     at java.lang.Thread.run(Unknown Source)
> ------------------------
>
> The client uses the Thrift library (0.8.0) and the
> Thrift compiler (0.8.0).
> ThriftFlumeEvent was built from this:
> https://svn.apache.org/repos/asf/incubator/flume/trunk/flume-core/src/main/thrift/flume.thrift
>
> Does anyone have a clue what is going wrong?
>
> Thanks in advance
> Søren
Re: thrift to flume node
Posted by Søren <sd...@syntonetic.com>.
Thanks Matthew, that actually got us further.
And thanks for the advice. We have already experienced hanging nodes
and are taking precautions until that is fixed.
But something in our setup is not compatible.
The receiving node (Flume for Windows version 0.9.4) is reporting:
------------------------
11/12/08 10:20:05 ERROR server.TSaneThreadPoolServer: Thrift error occurred during processing of message.
org.apache.thrift.protocol.TProtocolException: Missing version in readMessageBegin, old client?
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:213)
    at com.cloudera.flume.handlers.thrift.ThriftFlumeEventServer$Processor.process(ThriftFlumeEventServer.java:224)
    at org.apache.thrift.server.TSaneThreadPoolServer$WorkerProcess.run(TSaneThreadPoolServer.java:280)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
------------------------
The client uses the Thrift library (0.8.0) and the Thrift compiler (0.8.0).
ThriftFlumeEvent was built from this:
https://svn.apache.org/repos/asf/incubator/flume/trunk/flume-core/src/main/thrift/flume.thrift
Does anyone have a clue what is going wrong?
Thanks in advance
Søren
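For readers hitting the same error: the message comes from the strict-read check in Thrift's binary protocol, which expects the first 32-bit word of every message to carry a version marker. The Java sketch below reproduces that check in isolation, with no Thrift dependency, to show which leading bytes pass and which trigger "Missing version". The class and method names (StrictReadCheck, strictHeader, framedHeader, classify) are made up for illustration. A framed client is only one assumed way to end up with a non-negative first word, and the thread does not say what the actual cause was in this setup.

```java
import java.nio.ByteBuffer;

/** Illustrative only: mimics the strict-read version check, not the Thrift library itself. */
final class StrictReadCheck {
    // Constants used by Thrift's binary protocol for strict-mode message headers.
    static final int VERSION_MASK = 0xffff0000;
    static final int VERSION_1 = 0x80010000;
    static final byte CALL = 1; // message type for an ordinary request

    /** First four bytes of a strict-mode message: version marker OR'ed with the message type. */
    static byte[] strictHeader(byte messageType) {
        return ByteBuffer.allocate(4).putInt(VERSION_1 | messageType).array();
    }

    /** First four bytes a framed transport puts on the wire: the frame length (assumed scenario). */
    static byte[] framedHeader(int frameLength) {
        return ByteBuffer.allocate(4).putInt(frameLength).array();
    }

    /** Describes how a strict-read server would treat these leading bytes. */
    static String classify(byte[] firstFourBytes) {
        int word = ByteBuffer.wrap(firstFourBytes).getInt(); // big-endian, as on the wire
        if (word >= 0) {
            return "missing version"; // no version marker: looks like a length, not a header
        }
        return (word & VERSION_MASK) == VERSION_1 ? "ok" : "bad version";
    }

    public static void main(String[] args) {
        System.out.println("strict header: " + classify(strictHeader(CALL)));
        System.out.println("framed header: " + classify(framedHeader(42)));
    }
}
```

If the leading word classifies as "missing version", it is worth comparing the transport and protocol settings on the client with what the receiving node expects before suspecting the .thrift file.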
On 07/12/2011 16:27, Matthew Rathbone wrote:
> We're using thrift to send data to flume in production via Scala/Java.
> There are thrift bindings in the flume source code so you can generate
> your .net library. Check out the Client class, and the
> ThriftFlumeEvent class.
>
> It works, and it is more efficient, but there are a couple of issues with it:
> - we see the nodes 'hang' more often using this method (particularly
> on reconfiguration) - although they still report themselves as active.
> - if you throw too many events at a node you're going to have to
> implement your own throttling to stop your process choking. We have a
> 10k item queue that overflows when the process becomes saturated.
>
> For the source you then use rpcSource(port)
>
> Hope that helps.
>
> --
> Matthew Rathbone
> Foursquare | Software Engineer | Server Engineering Team
> matthew@foursquare.com | @rathboma <http://twitter.com/rathboma>
> | 4sq <http://foursquare.com/rathboma>
Re: thrift to flume node
Posted by Matthew Rathbone <ma...@foursquare.com>.
We're using thrift to send data to flume in production via Scala/Java. There are thrift bindings in the flume source code so you can generate your .net library. Check out the Client class, and the ThriftFlumeEvent class.
It works, and it is more efficient, but there are a couple of issues with it:
- we see the nodes 'hang' more often using this method (particularly on reconfiguration) - although they still report themselves as active.
- if you throw too many events at a node you're going to have to implement your own throttling to stop your process choking. We have a 10k item queue that overflows when the process becomes saturated.
For the source you then use rpcSource(port)
Hope that helps.
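The client-side throttling mentioned above could look roughly like the following Java sketch: producers enqueue into a bounded buffer without blocking, and events are counted as dropped once the buffer is full. This is one possible reading of "a 10k item queue that overflows", not the actual Foursquare code. BoundedEventBuffer and its methods are hypothetical names, and the step that drains the queue and sends each event over Thrift is left out.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical client-side buffer; names and sizes are illustrative, not Flume API. */
final class BoundedEventBuffer {
    private final BlockingQueue<byte[]> queue;
    private final AtomicLong dropped = new AtomicLong();

    BoundedEventBuffer(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Non-blocking enqueue: returns false and counts a drop when the buffer is full. */
    boolean offer(byte[] eventBody) {
        boolean accepted = queue.offer(eventBody);
        if (!accepted) {
            dropped.incrementAndGet();
        }
        return accepted;
    }

    /** For the sender thread: blocks until an event is available. */
    byte[] take() throws InterruptedException {
        return queue.take();
    }

    int size() {
        return queue.size();
    }

    long droppedCount() {
        return dropped.get();
    }

    public static void main(String[] args) {
        BoundedEventBuffer buffer = new BoundedEventBuffer(10_000);
        for (int i = 0; i < 10_005; i++) {
            buffer.offer(("event " + i).getBytes());
        }
        // prints "10000 queued, 5 dropped"
        System.out.println(buffer.size() + " queued, " + buffer.droppedCount() + " dropped");
    }
}
```

Dropping on overflow keeps the application responsive when the node is saturated, at the cost of losing events; blocking the producer instead would trade that the other way.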
--
Matthew Rathbone
Foursquare | Software Engineer | Server Engineering Team
matthew@foursquare.com | @rathboma (http://twitter.com/rathboma) | 4sq (http://foursquare.com/rathboma)
On Wednesday, December 7, 2011 at 4:32 AM, alo alt wrote:
> Hi,
>
> I played with it, but was not really impressed. In a past project we wrote a .net module to send the content over syslog-tcp, set up a syslog-ng host and wrote a filter definition for it. Then we used flume with a tail source.
>
> - alex
Re: thrift to flume node
Posted by alo alt <wg...@googlemail.com>.
Hi,
I played with it, but was not really impressed. In a past project we
wrote a .net module to send the content over syslog-tcp, set up a syslog-ng
host and wrote a filter definition for it. Then we used flume with a
tail source.
- alex
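As a rough illustration of the syslog-over-TCP route, the Java sketch below formats a BSD-style syslog line and writes it to a TCP socket, one message per line. The original module was written in .NET and is not shown in this thread, so everything here is an assumption: the SyslogLine class, its method names, the newline-delimited framing, and the host and port passed to send.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.time.LocalDateTime;

/** Hypothetical helper; not taken from the module described in the thread. */
final class SyslogLine {
    private static final String[] MONTHS = {
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
    };

    /** Syslog priority value: facility * 8 + severity. */
    static int priority(int facility, int severity) {
        return facility * 8 + severity;
    }

    /** Builds "<PRI>Mmm dd hh:mm:ss host tag: message" with a space-padded day of month. */
    static String format(int facility, int severity, LocalDateTime time,
                         String host, String tag, String message) {
        String timestamp = String.format("%s %2d %02d:%02d:%02d",
                MONTHS[time.getMonthValue() - 1], time.getDayOfMonth(),
                time.getHour(), time.getMinute(), time.getSecond());
        return "<" + priority(facility, severity) + ">" + timestamp
                + " " + host + " " + tag + ": " + message;
    }

    /** Sends one newline-terminated line over TCP (assumed framing; check the receiver's config). */
    static void send(String syslogHost, int port, String line) throws IOException {
        try (Socket socket = new Socket(syslogHost, port);
             OutputStream out = socket.getOutputStream()) {
            out.write((line + "\n").getBytes(StandardCharsets.UTF_8));
            out.flush();
        }
    }
}
```

For example, SyslogLine.format(1, 6, LocalDateTime.of(2011, 12, 7, 11, 14, 57), "apphost", "myapp", "hello") yields "<14>Dec  7 11:14:57 apphost myapp: hello", which the syslog host can then filter into a file for flume to tail.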
On Wed, Dec 7, 2011 at 11:14 AM, Søren <sd...@syntonetic.com> wrote:
> Hi all Flume users
>
> We are looking into sending content directly to a flume node using
> thrift. One of our sources is a C# .NET app. I have seen in an earlier thread
> http://mail-archives.apache.org/mod_mbox/incubator-flume-user/201109.mbox/browser
> that this should be possible, but I haven't come across exhaustive
> documentation targeting this particular use.
> Is it working well?
>
> I have collected the bits of information on the subject, and the only
> piece I seem to be missing is where to find a
> flume.thrift file compatible with flume-node v0.9.4, or an alternative
> sufficient for writing to a flume node from C#.
> If anyone knows about this, please share.
>
> Thanks in advance
> Søren
>
>
--
Alexander Lorenz
http://mapredit.blogspot.com