Posted to dev@flume.apache.org by GitBox <gi...@apache.org> on 2022/11/02 11:02:03 UTC

[GitHub] [flume] tmgstevens commented on pull request #351: FLUME-3415 - Provide Docker image for Flume

tmgstevens commented on PR #351:
URL: https://github.com/apache/flume/pull/351#issuecomment-1300061047

   So a couple of points in here:
   
   > It packages all the artifacts whether they are required or not.
   
   True. Packaging is something I think we need to think about going forward. For people who want a lightweight deployment, should we offer different profiles? The flip side is that if you're combining two components from different modules (e.g. syslog and Kafka, or HTTP and HDFS), do you ever actually get the benefit of the modularity, or does it just ramp up your complexity? (Complexity = adoption blocker in my mind.)
   
   > Unless I am mistaken, it is getting the distribution tar and using the configuration located within that. I fail to see how useful that will be as I would expect most users would have a custom configuration.
   
   It does bundle the default conf directory, but it is anticipated that a user would re-map that directory or pass config in via environment variables (which could then include secrets). Both approaches are designed to work in Docker and Kubernetes. There's an example of doing that here: https://github.com/apache/flume/blob/d2bd7812dbacd86459726c0fd3dc774272ce0222/flume-ng-tests/src/test/java/org/apache/flume/test/util/DockerInstall.java#L137-L153
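
As a rough illustration of the two options above (this is not code from the PR), the sketch below assembles a `docker run` command line that bind-mounts a host conf directory over the image's default one and forwards named environment variables. The image name `example/flume:latest`, the container path `/opt/flume/conf`, and the variable name `KAFKA_PASSWORD` are all hypothetical placeholders; the real values depend on how the image is built.

```java
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class FlumeDockerRunSketch {

    // Hypothetical values: neither the image name nor the conf path is taken from the PR.
    static final String IMAGE = "example/flume:latest";
    static final String CONTAINER_CONF_DIR = "/opt/flume/conf";

    /** Builds a `docker run` argument list; it does not execute anything. */
    static List<String> buildCommand(Path hostConfDir, List<String> envVarNames) {
        List<String> cmd = new ArrayList<>();
        cmd.add("docker");
        cmd.add("run");
        cmd.add("--rm");
        // Option 1: re-map the conf directory with a read-only bind mount.
        cmd.add("-v");
        cmd.add(hostConfDir.toAbsolutePath() + ":" + CONTAINER_CONF_DIR + ":ro");
        // Option 2: forward environment variables. Passing only the name makes
        // Docker copy the value from the caller's environment, so a secret
        // does not have to appear on the command line.
        for (String name : envVarNames) {
            cmd.add("-e");
            cmd.add(name);
        }
        cmd.add(IMAGE);
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = buildCommand(Path.of("conf"), List.of("KAFKA_PASSWORD"));
        System.out.println(String.join(" ", cmd));
    }
}
```

The resulting list could be handed to `ProcessBuilder`. In Kubernetes the same two ideas would usually map to a mounted ConfigMap and to environment variables sourced from a Secret.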
   
   > I think starting from "everything goes in the bucket" to match our historic deployment shape is fine. I agree that eventually we need to get to a more modular approach that provides easy examples for folks building just the parts they need.
   
   +1
   
   > There's some tooling around for easier container image building based on spring boot applications, e.g. Jib. Maybe we should take an approach that leverages that?
   
   Personally, I'd rather not rewrite the Docker deployment right now, given that what's there works pretty well. We could look at moving away from the Spotify plugin to something else, but I don't want to re-architect the whole packaging of Flume at the moment.
   
   > To be clear, I use the FileChannel, which pretty much requires fast disk (i.e. SSDs or equivalent) to perform well. It also requires a dedicated "disk" so that data isn't lost on restart. This doesn't really work well with Docker/Kubernetes so we don't use it for Flume.
   
   I actually think this would be fine: you can have persistent disks assigned to pods in Kubernetes, and the same applies to Docker, where you can mount a volume. Would I necessarily recommend that you rewrite your deployment model to use containers? No. But, for example, in @busbey's world, where he might be moving from a previously managed deployment to something that needs additional orchestration, using Docker or Kubernetes to deploy agents across many nodes could make things much easier to manage and maintain.
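
To make the persistent-disk point concrete, here is a minimal sketch, under stated assumptions, of the FileChannel settings that would point at a mounted volume. `type`, `checkpointDir`, and `dataDirs` are the FileChannel property names from the Flume user guide; the agent name `a1`, the channel name `c1`, and the mount point `/var/lib/flume` are hypothetical and would need to match the volume actually attached to the container or pod.

```java
import java.io.IOException;
import java.util.Properties;

public class FileChannelVolumeSketch {

    /** Builds FileChannel properties whose directories live under mountPoint. */
    static Properties fileChannelProps(String agent, String channel, String mountPoint) {
        Properties p = new Properties();
        String prefix = agent + ".channels." + channel + ".";
        p.setProperty(agent + ".channels", channel);
        p.setProperty(prefix + "type", "file");
        // Both directories sit on the persistent volume so that checkpoint and
        // event data survive a container restart.
        p.setProperty(prefix + "checkpointDir", mountPoint + "/checkpoint");
        p.setProperty(prefix + "dataDirs", mountPoint + "/data");
        return p;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical names and mount point, for illustration only.
        fileChannelProps("a1", "c1", "/var/lib/flume").store(System.out, null);
    }
}
```

Whether the volume is fast enough for FileChannel is a separate question that depends on the storage class or disk behind it.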
   
   
   
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
