Posted to user@storm.apache.org by I PVP <ip...@hotmail.com> on 2017/09/11 22:08:51 UTC
sharing across Bolts
What is the best practice for sharing a Collection across bolts when many bolts will use it, each performing a specific summarization or statistics calculation?
The objective is to retrieve the collection only once, instead of retrieving it again in each bolt.
Should I just emit the collection from the intermediary bolt, or is there a better way, something like an internal cache?
The overall topology, using fieldsGrouping, is:
---
1) KafkaSpout
Receives the identifier (UUID) that will drive the retrieval of a collection of retail transactions. Example: List<Transaction>
2) Bolt
Retrieves and emits (collector.emit) the collection of transactions that will be subject to multiple calculations. (Is this correct, or could it cause a memory issue as the number of bolts grows?)
3) Around 6 other bolts should use that same collection of transactions to execute different types of summarization and statistics calculations and write the metrics to Cassandra.
---
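To make the "internal cache" option concrete, here is a minimal sketch in plain Java, with no Storm API involved. The names SharedBatchCache, loader, and evict are hypothetical, and the loader stands in for whatever call actually retrieves the transactions for a UUID. One caveat: a cache like this is only shared among bolt instances that run in the same worker JVM, so bolts placed in other workers would each trigger their own load.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical per-JVM cache: each collection is loaded at most once per key. */
final class SharedBatchCache<V> {
    private final Map<UUID, List<V>> cache = new ConcurrentHashMap<>();
    private final Function<UUID, List<V>> loader; // assumption: the real retrieval call

    SharedBatchCache(Function<UUID, List<V>> loader) {
        this.loader = loader;
    }

    /** Returns the cached collection, invoking the loader only on the first request. */
    List<V> get(UUID id) {
        // ConcurrentHashMap.computeIfAbsent applies the function atomically per key.
        return cache.computeIfAbsent(id,
                k -> Collections.unmodifiableList(new ArrayList<>(loader.apply(k))));
    }

    /** Must be called once all calculations are done, or the map grows without bound. */
    void evict(UUID id) {
        cache.remove(id);
    }
}

class SharedBatchCacheDemo {
    public static void main(String[] args) {
        AtomicInteger loads = new AtomicInteger();
        // Transaction amounts stand in for List<Transaction>.
        SharedBatchCache<Double> cache = new SharedBatchCache<>(id -> {
            loads.incrementAndGet();
            return Arrays.asList(10.0, 20.0, 30.0);
        });

        UUID id = UUID.randomUUID();
        // Two different "calculations" read the same cached collection.
        double sum = cache.get(id).stream().mapToDouble(Double::doubleValue).sum();
        double max = cache.get(id).stream().mapToDouble(Double::doubleValue).max().orElse(0.0);
        System.out.println("sum=" + sum + " max=" + max + " loads=" + loads.get());

        cache.evict(id);
    }
}
```

With this shape, the intermediary bolt would only need to emit the UUID rather than the whole list, and each calculation bolt would look the collection up by that UUID. Deciding when it is safe to evict an entry is left open here.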
Thanks
IPVP