Posted to commits@camel.apache.org by GitBox <gi...@apache.org> on 2020/01/13 18:25:22 UTC

[GitHub] [camel-k] davesargrad opened a new issue #1196: Performance Question On Camel-K

davesargrad opened a new issue #1196: Performance Question On Camel-K
URL: https://github.com/apache/camel-k/issues/1196

We are looking at a system where we will need to run hundreds, perhaps even thousands, of camel-k integrations. I am trying to understand the overhead associated with a single camel-k integration.

I'm assuming that 100 integrations would result in 100 instances of a camel-k based pod. This means each pod would require a single JVM, hence 100 JVM instances.

I could envision an alternative architectural approach that would perform the same 100 functions in a single pod, as long as that pod is running all 100 routes.

Is it possible to create an integration that implements N routes, rather than a single route per integration?

Can you please help us to understand the way to most efficiently architect a camel-k based solution that must perform the equivalent of 100's or 1000's of routes?

What guidance can you offer to help us come up with an appropriate performance-focused camel-k based architecture?

I'm interested in understanding how to architect the system to find the right architectural balance with a focus on vCPU and RAM resources.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

With regards,
Apache Git Services

[GitHub] [camel-k] davsclaus commented on issue #1196: Performance Question On Camel-K

Posted by GitBox <gi...@apache.org>.

davsclaus commented on issue #1196: Performance Question On Camel-K
URL: https://github.com/apache/camel-k/issues/1196#issuecomment-574663419
 
 
   @nicolaferraro there are some great bits here, maybe you/we could put together a blog post and get it posted on your blog + camel website.


[GitHub] [camel-k] davesargrad commented on issue #1196: Performance Question On Camel-K

Posted by GitBox <gi...@apache.org>.

davesargrad commented on issue #1196: Performance Question On Camel-K
URL: https://github.com/apache/camel-k/issues/1196#issuecomment-574662442
 
 
   To @nicolaferraro 
   
   This is simply awesome architectural feedback. I appreciate it greatly. I understand the tradeoffs you describe, and indeed I do see huge value in the microservices architecture. I will resist the temptation to build an integration monolith that performs 1000's of routes.
   
   I think you are correct that we may end up with a balanced set of integrations, several routes per integration, where these routes are related in some fashion. 
   
   In the end game we will have 1000's or even tens of 1000's of routes to implement, and my goal is to drive a sensible architecture that doesn't require excessive VM resources. Your reasoning will be at the core of my thinking.
   
   I will look into quarkus. We are interested in such game changers. It is good to know that you are also focused on this.


[GitHub] [camel-k] nicolaferraro commented on issue #1196: Performance Question On Camel-K

Posted by GitBox <gi...@apache.org>.

nicolaferraro commented on issue #1196: Performance Question On Camel-K
URL: https://github.com/apache/camel-k/issues/1196#issuecomment-574416039

I can try to give some ideas to reason on, but there's no general rule valid for all scenarios.

You can run multiple routes in the same JVM, just with `kamel run Routes1.java Routes2.java RoutesN.java --name routes-pack`.

The question is when you want to do that and why. I think you should apply the same reasoning that people use when dealing with microservices architectures, where the big integration containing 1000 routes is the `monolith`.

E.g. do you want independent scalability of some integration flows? If so, deploy them separately so that you can set the number of replicas independently. If you use Knative they will scale automatically depending on the load, but if you have a single fat integration containing all routes, you need to scale the whole thing, which is heavyweight (it takes more time to start up and uses far more resources than needed).
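
To make the scaling point above concrete, here is a minimal back-of-the-envelope sketch. It is not part of Camel K: the class `ScalingSketch`, its method names, and the replica numbers are all hypothetical. It counts how many route copies end up running when each flow is deployed on its own versus when all flows share one fat integration that must be scaled as a unit.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not a Camel K API.
final class ScalingSketch {

    // Each flow is its own integration, so it runs only the replicas it needs.
    static int separateRouteInstances(int[] replicasNeeded) {
        return Arrays.stream(replicasNeeded).sum();
    }

    // All flows share one integration: every replica carries every route,
    // so the busiest flow dictates the replica count for all of them.
    static int fatRouteInstances(int[] replicasNeeded) {
        int maxReplicas = Arrays.stream(replicasNeeded).max().orElse(0);
        return maxReplicas * replicasNeeded.length;
    }

    public static void main(String[] args) {
        // Assumed load: one busy flow needing 10 replicas, two quiet flows needing 1 each.
        int[] replicasNeeded = {10, 1, 1};
        System.out.println("separate: " + separateRouteInstances(replicasNeeded)); // 12
        System.out.println("fat:      " + fatRouteInstances(replicasNeeded));      // 30
    }
}
```

Note that this counts route copies, not pods: in this example the fat integration runs 10 pods against 12 for the separate deployments, but each of those 10 pods carries all three routes. Which side wins depends on the per-JVM overhead relative to the per-route cost, which this sketch deliberately does not model.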

E.g. do you have multiple teams? If so, you probably want each team to be responsible for its own deployments without having to synchronize with other teams. You may also want an update to one integration not to interfere with other integrations that are already running (which it will if they share the same JVM).

I think many other principles that apply to microservices apply also here. You'll end up somewhere in the middle between a single fat integration and an integration per route.
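
As a rough sketch of that middle ground, the following groups route files by a domain label and emits one `kamel run` command per group, using the same command form shown earlier in this comment. The class `RoutePacks`, the file names, and the domain names are hypothetical; how routes are assigned to domains is an assumption you would replace with your own criteria (team ownership, scaling needs, release cadence).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not a Camel K API.
final class RoutePacks {

    // Builds one "kamel run <files...> --name <domain>" command per domain,
    // keeping domains and files in the order they were first seen.
    static List<String> kamelCommands(Map<String, String> domainByFile) {
        Map<String, List<String>> filesByDomain = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : domainByFile.entrySet()) {
            filesByDomain
                .computeIfAbsent(entry.getValue(), domain -> new ArrayList<>())
                .add(entry.getKey());
        }
        List<String> commands = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : filesByDomain.entrySet()) {
            commands.add("kamel run " + String.join(" ", entry.getValue())
                + " --name " + entry.getKey());
        }
        return commands;
    }

    public static void main(String[] args) {
        // Assumed example data: three route files spread over two domains.
        Map<String, String> domainByFile = new LinkedHashMap<>();
        domainByFile.put("OrdersIn.java", "orders");
        domainByFile.put("OrdersOut.java", "orders");
        domainByFile.put("Billing.java", "billing");
        kamelCommands(domainByFile).forEach(System.out::println);
    }
}
```

Each resulting integration can then be scaled and updated on its own, while related routes still share a JVM.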

The long-term goal of Camel K is to allow you to split based on domain logic rather than resource utilization, by drastically reducing the amount of resources needed. We've already done some work on reducing the resources used in the cluster and we'll do a lot more.

What you have now:
- Knative services available for HTTP based endpoints: they shut down the JVM when not used
- CronJob (fresh #1197): they activate a JVM only when they need to run

What we're working on:
- Quarkus native compilation: so you don't run a full JVM but a tiny binary for each integration which uses resources comparable to that of a golang application
- Keda autoscalers: to run integrations only when they need to process data and stop them when idle

In particular, Quarkus is a game changer. The [Camel-quarkus](https://github.com/apache/camel-quarkus/tree/master/extensions) repository already contains a lot of Camel components that can compile to native (Camel K will be able to compile to native transparently in one of the next releases, we're working on it).

If I had to run 1000 integration flows, the first thing I would do is check whether the Camel components I need are in the list of Quarkus extensions, and contribute what's missing. That would allow me to care less about non-functional requirements and focus on business logic and maintainability.
