Posted to user@flume.apache.org by Javi Roman <ja...@redoop.org> on 2014/09/02 09:54:11 UTC

Flume scaling best practices question

Hi!

I'm using the flume-snmp-source plugin [1] to query a managed host
with the following configuration:

agent.sources.source1.type = org.apache.flume.source.SNMPQuerySource
agent.sources.source1.host = 23.23.52.11
agent.sources.source1.port = 161
agent.sources.source1.delay = 30
agent.sources.source1.oid1 = 1.3.6.1.4.1.2000.1.2.5.1.3
agent.sources.source1.oid2 = 1.3.6.1.4.1.2000.1.2.5.1.7
agent.sources.source1.oid3 = 1.3.6.1.4.1.2000.1.2.5.1.9
agent.sources.source1.oid4 = 1.3.6.1.4.1.2000.1.2.5.1.10
agent.sources.source1.oid5 = 1.3.6.1.4.1.2000.1.2.5.1.12
agent.sources.source1.oid6 = 1.3.6.1.4.1.2000.1.2.5.1.13
....
agent.sources.source1.oidN = N.N.N.N ...


The plugin is a PollableSource that queries the managed host
"source1.host" every "source1.delay" seconds. The message passed to
the Flume channel has this format:

"current date, ip managed device, oid1 answer, oid2 answer, ...., oidN answer"

The query is made using SNMP GETBULK for performance reasons.
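
To make the message format above concrete, here is a minimal sketch in
Python (not the plugin's actual Java code). The function name
format_event_body and the date layout "%Y-%m-%d %H:%M:%S" are
assumptions for illustration only; the field order and the ", "
separator follow the format line above.

```python
from datetime import datetime


def format_event_body(timestamp, device_ip, answers):
    """Build one comma-separated line: date, device IP, then one field per OID answer.

    Hypothetical helper: the date layout is an assumption, not the plugin's format.
    """
    fields = [timestamp.strftime("%Y-%m-%d %H:%M:%S"), device_ip]
    # Answers are appended in the same order as oid1..oidN in the config.
    fields.extend(str(answer) for answer in answers)
    return ", ".join(fields)


if __name__ == "__main__":
    body = format_event_body(datetime.now(), "23.23.52.11", [10, 20, 30])
    print(body)
```

Each poll of a host would produce one such line as the event body.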

The plugin works fine (development is at the alpha stage). However, as
I move on to more robust tests, I have the following question about
scaling Flume:

I have to query more than 1,000 managed devices with the same SNMP
query, so I have to create a "source" entry in the flume.conf file for
each host. Is this the correct way to do it, maintaining a huge
"flume.conf" file with thousands of entries? Or is there a better
strategy for large-scale problems like this?
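
For illustration, one way to avoid editing such a file by hand would be
to generate the per-host entries from a host list. This is only a
sketch under stated assumptions: the function name render_sources and
the sourceN naming scheme are hypothetical, and the property names are
the ones shown in the configuration above. Channels and sinks would
still have to be wired up separately.

```python
SOURCE_TYPE = "org.apache.flume.source.SNMPQuerySource"


def render_sources(hosts, oids, agent="agent", port=161, delay=30):
    """Render one source entry per host, all sharing the same OID list.

    Hypothetical generator: it only emits the source properties shown in
    the post, not channel or sink configuration.
    """
    names = [f"source{i}" for i in range(1, len(hosts) + 1)]
    # Flume expects the source names of an agent as a space-separated list.
    lines = [f"{agent}.sources = {' '.join(names)}"]
    for name, host in zip(names, hosts):
        prefix = f"{agent}.sources.{name}"
        lines.append(f"{prefix}.type = {SOURCE_TYPE}")
        lines.append(f"{prefix}.host = {host}")
        lines.append(f"{prefix}.port = {port}")
        lines.append(f"{prefix}.delay = {delay}")
        for n, oid in enumerate(oids, start=1):
            lines.append(f"{prefix}.oid{n} = {oid}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example_hosts = ["10.0.0.1", "10.0.0.2"]  # placeholder addresses
    example_oids = ["1.3.6.1.4.1.2000.1.2.5.1.3", "1.3.6.1.4.1.2000.1.2.5.1.7"]
    print(render_sources(example_hosts, example_oids), end="")
```

This keeps the host list as the single thing to maintain, but it does
not answer whether one agent with thousands of sources is a sensible
design.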

Many thanks!

[1] https://github.com/javiroman/flume-snmp-source