Posted to user@flume.apache.org by Chen Wang <ch...@gmail.com> on 2014/01/09 23:38:36 UTC

seeking help on flume cluster deployment

Guys,
In my environment, the clients are 5 socket servers, so I wrote a custom
source that spawns 5 threads, each reading from one of the servers
indefinitely. The sink is HDFS (a Hive table). This works fine when I run
it with a single flume-ng agent.

But how can I deploy this in distributed mode (as a cluster)? I am confused
about the 3 tiers (agent, collector, storage) mentioned in the docs. Do they
apply to my case? How can I separate my agent, collector, and storage tiers?
Apparently I can only have one agent running: multiple agents would each
read from the socket servers and produce duplicates. But if one agent dies,
I want another agent to take over. I would also like to be able to scale
writes to HDFS horizontally. How can I achieve all this?

Thank you very much for your advice.
Chen