Posted to user@flume.apache.org by Søren <sd...@syntonetic.com> on 2011/12/07 11:14:57 UTC

thrift to flume node

Hi all Flume users

We are looking into sending content directly to a flume node using
thrift. One of our sources is a C# .NET app. I have seen in an earlier thread
http://mail-archives.apache.org/mod_mbox/incubator-flume-user/201109.mbox/browser
that this should be possible, but I haven't come across exhaustive
documentation targeting this particular use.
Is it working well?

I have collected the bits of information on the subject, and the only
piece I seem to be missing is where to find a
flume.thrift file compatible with flume-node v0.9.4, or an alternative
sufficient for writing to a flume node from C#.
If anyone knows about this, please share.

Thanks in advance
Søren


Re: thrift to flume node

Posted by Søren <sd...@syntonetic.com>.
It seems the flume.thrift is right.

We solved it ourselves with inspiration from various code examples:
http://blog.ozbuyucusu.com/2011/08/01/flume-rpcsource-example/
and Eran's C# code in an earlier post in this forum.

Case closed
brgs Søren

On 08/12/2011 11:01, Søren wrote:
> Thanks Matthew, that actually got us further.
> And thanks for the advice. We have already experienced hanging nodes
> and are taking precautions until that is fixed.
>
> But something is not compliant in our setup.
>
> The receiving node (Flume for Windows version 0.9.4) is reporting:
> ------------------------
> 11/12/08 10:20:05 ERROR server.TSaneThreadPoolServer: Thrift error occurred during processing of message.
> org.apache.thrift.protocol.TProtocolException: Missing version in readMessageBegin, old client?
>     at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:213)
>     at com.cloudera.flume.handlers.thrift.ThriftFlumeEventServer$Processor.process(ThriftFlumeEventServer.java:224)
>     at org.apache.thrift.server.TSaneThreadPoolServer$WorkerProcess.run(TSaneThreadPoolServer.java:280)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>     at java.lang.Thread.run(Unknown Source)
> ------------------------
>
> The client uses the Thrift libs (0.8.0) and the Thrift compiler (0.8.0).
> ThriftFlumeEvent was built from this:
> https://svn.apache.org/repos/asf/incubator/flume/trunk/flume-core/src/main/thrift/flume.thrift
>
> Does anyone have a clue what is going wrong?
>
> Thanks in advance
> Søren
>

Re: thrift to flume node

Posted by Søren <sd...@syntonetic.com>.
Thanks Matthew, that actually got us further.
And thanks for the advice. We have already experienced hanging nodes
and are taking precautions until that is fixed.

But something is not compliant in our setup.

The receiving node (Flume for Windows version 0.9.4) is reporting:
------------------------
11/12/08 10:20:05 ERROR server.TSaneThreadPoolServer: Thrift error occurred during processing of message.
org.apache.thrift.protocol.TProtocolException: Missing version in readMessageBegin, old client?
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:213)
    at com.cloudera.flume.handlers.thrift.ThriftFlumeEventServer$Processor.process(ThriftFlumeEventServer.java:224)
    at org.apache.thrift.server.TSaneThreadPoolServer$WorkerProcess.run(TSaneThreadPoolServer.java:280)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
------------------------

The client uses the Thrift libs (0.8.0) and the Thrift compiler (0.8.0).
ThriftFlumeEvent was built from this:
https://svn.apache.org/repos/asf/incubator/flume/trunk/flume-core/src/main/thrift/flume.thrift

Does anyone have a clue what is going wrong?

Thanks in advance
Søren
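
For readers hitting the same exception, here is a rough sketch of what the "Missing version in readMessageBegin" check looks at. It is not a diagnosis of this particular setup, and the thread does not say what the eventual fix was. A server reading the Thrift binary protocol in strict mode expects the first 32-bit word of a message to carry a version marker with the high bit set. If the first word is non-negative, the message is rejected. Two common ways a client can produce such a first word are writing the older non-strict header, or wrapping the message in a length-prefixed frame that the server does not expect. The method name "append" below is only an illustrative assumption.

```python
import struct

VERSION_1 = 0x80010000     # version marker of the strict binary protocol
VERSION_MASK = 0xFFFF0000
CALL = 1                   # Thrift message type for a request


def strict_message_begin(name: str, seqid: int) -> bytes:
    """Message header as written by a client with strict writes enabled."""
    raw = name.encode("utf-8")
    return (struct.pack("!I", VERSION_1 | CALL)
            + struct.pack("!i", len(raw)) + raw
            + struct.pack("!i", seqid))


def old_style_message_begin(name: str, seqid: int) -> bytes:
    """Message header without the version word (strict writes disabled)."""
    raw = name.encode("utf-8")
    return (struct.pack("!i", len(raw)) + raw
            + struct.pack("!b", CALL)
            + struct.pack("!i", seqid))


def framed(payload: bytes) -> bytes:
    """Framed-transport style wrapping: a 4-byte length prefix."""
    return struct.pack("!i", len(payload)) + payload


def strict_reader_accepts(data: bytes) -> bool:
    """Mimic the first check a strict-read server performs."""
    (first,) = struct.unpack("!i", data[:4])
    if first >= 0:
        # No version marker: this is the "old client?" situation.
        return False
    return (first & VERSION_MASK) == VERSION_1


if __name__ == "__main__":
    header = strict_message_begin("append", 0)
    print(strict_reader_accepts(header))
    print(strict_reader_accepts(old_style_message_begin("append", 0)))
    print(strict_reader_accepts(framed(header)))
```

If the client and the node disagree on either the strict-write setting or the transport (framed versus unframed), the server sees a non-negative first word and raises this exception, so those two settings are reasonable things to compare first.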

On 07/12/2011 16:27, Matthew Rathbone wrote:
> We're using thrift to send data to flume in production via Scala/Java. 
> There are thrift bindings in the flume source code so you can generate 
> your .net library. Check out the Client class, and the 
> ThriftFlumeEvent class.
>
> It works, and it is more efficient, but there are a couple of issues with it:
> - we see the nodes 'hang' more often using this method (particularly 
> on reconfiguration) - although they still report themselves as active.
> - if you throw too many events at a node you're going to have to 
> implement your own throttling to stop your process choking. We have a 
> 10k item queue that overflows when the process becomes saturated.
>
> For the source you then use rpcSource(port)
>
> Hope that helps.
>

Re: thrift to flume node

Posted by Matthew Rathbone <ma...@foursquare.com>.
We're using thrift to send data to flume in production via Scala/Java. There are thrift bindings in the flume source code so you can generate your .net library. Check out the Client class, and the ThriftFlumeEvent class.  

It works, and it is more efficient, but there are a couple of issues with it:
- we see the nodes 'hang' more often using this method (particularly on reconfiguration) - although they still report themselves as active.
- if you throw too many events at a node you're going to have to implement your own throttling to stop your process choking. We have a 10k item queue that overflows when the process becomes saturated.
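
The client-side throttling mentioned in the second point could look roughly like the sketch below. It is only an illustration, not the code used in production at Foursquare: a bounded buffer that drops new events once it is full, so the producing process never blocks on a saturated node. The class name, the method names and the drop-newest policy are all assumptions made for the example. Only the 10k capacity comes from the description above.

```python
import queue


class DroppingBuffer:
    """Bounded event buffer that discards new events when full."""

    def __init__(self, capacity: int = 10_000):
        self._q = queue.Queue(maxsize=capacity)
        self.dropped = 0  # number of events lost to overflow

    def offer(self, event) -> bool:
        """Try to enqueue without blocking; return False if dropped."""
        try:
            self._q.put_nowait(event)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def drain(self, send, max_items: int = 100) -> int:
        """Pass up to max_items queued events to send(); return the count."""
        sent = 0
        while sent < max_items:
            try:
                event = self._q.get_nowait()
            except queue.Empty:
                break
            send(event)
            sent += 1
        return sent
```

A sender thread would call drain() in a loop with a function that performs the actual thrift append, while the application calls offer() and watches the dropped counter to notice saturation.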

For the source you then use rpcSource(port)

Hope that helps.  

--  
Matthew Rathbone
Foursquare | Software Engineer | Server Engineering Team
matthew@foursquare.com (mailto:matthew@foursquare.com) | @rathboma (http://twitter.com/rathboma) | 4sq (http://foursquare.com/rathboma)



On Wednesday, December 7, 2011 at 4:32 AM, alo alt wrote:

> Hi,
>  
> I played with it, but was not really impressed. In a past project we wrote a .NET module to send the content over syslog-tcp, set up a syslog-ng host and wrote a filter definition inside. Then we used flume as a tail-source.
>  
> - alex


Re: thrift to flume node

Posted by alo alt <wg...@googlemail.com>.
Hi,

I played with it, but was not really impressed. In a past project we
wrote a .NET module to send the content over syslog-tcp, set up a syslog-ng
host and wrote a filter definition inside. Then we used flume as a
tail-source.

- alex
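
The sending side of the syslog route described above can be sketched as follows. This is not the .NET module from that project, just a small illustration of the idea: format each record as a classic BSD-style syslog line and write it, newline-terminated, to a TCP socket on the syslog-ng host. The function names, the host name, the tag and the port in the usage note are hypothetical.

```python
import socket


def syslog_line(facility: int, severity: int, host: str,
                tag: str, message: str, timestamp: str) -> bytes:
    """Build one BSD-style syslog line: <PRI>TIMESTAMP HOST TAG: MSG."""
    pri = facility * 8 + severity  # priority value combines both fields
    return f"<{pri}>{timestamp} {host} {tag}: {message}\n".encode("utf-8")


def send_over_tcp(line: bytes, address: tuple) -> None:
    """Open a TCP connection to the syslog host and write one line."""
    with socket.create_connection(address, timeout=5) as sock:
        sock.sendall(line)
```

For example, send_over_tcp(syslog_line(16, 6, "apphost", "myapp", "started", "Dec  7 11:14:57"), ("syslog.example.org", 514)) would send a local0/info record, assuming a syslog-ng TCP source listens on that hypothetical address.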




-- 
Alexander Lorenz
http://mapredit.blogspot.com
