Posted to user@storm.apache.org by nitin sharma <ku...@gmail.com> on 2015/06/04 19:58:31 UTC

How to count Tuples in each batch

Hi All,

I would like to know how many tuples are processed in each batch. Is there a
way to do this?

Maybe there is something I can code in the execute() method and print to the log?

Regards,
Nitin Kumar Sharma.

Re: How to count Tuples in each batch

Posted by Ashish Soni <as...@gmail.com>.
Try the code below; it prints the count for each field_1 group in every batch.


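import backtype.storm.Config;
import backtype.storm.LocalCluster;
import backtype.storm.tuple.Fields;
import storm.trident.TridentTopology;
import storm.trident.operation.builtin.Count;
import storm.trident.spout.RichSpoutBatchExecutor;

TridentTopology topology = new TridentTopology();
topology.newStream("cdrevent", new CSVSpout("testdata.csv", ',', false))
        .partitionBy(new Fields("field_1"))
        .groupBy(new Fields("field_1"))
        // Count the tuples of each field_1 group within the current batch
        .aggregate(new Fields("field_1"), new Count(), new Fields("count"))
        .each(new Fields("field_1", "count"), new Utils.PrintFilter());

Config config = new Config();
// Cap the number of tuples the spout emits per batch
config.put(RichSpoutBatchExecutor.MAX_BATCH_SIZE_CONF, 200);

LocalCluster cluster = new LocalCluster();
cluster.submitTopology("cdreventTopology", config, topology.build());
backtype.storm.utils.Utils.sleep(10000);
cluster.killTopology("cdreventTopology");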



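import java.util.Map;
import storm.trident.operation.Filter;
import storm.trident.operation.TridentOperationContext;
import storm.trident.tuple.TridentTuple;

// Nested inside the Utils class referenced above
public static class PrintFilter implements Filter {

    @Override
    public void prepare(Map conf, TridentOperationContext context) {
    }

    @Override
    public void cleanup() {
    }

    @Override
    public boolean isKeep(TridentTuple tuple) {
        // Print every tuple and keep it in the stream
        System.out.println(tuple);
        return true;
    }
}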
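
Note that the pipeline above groups by field_1 before counting, so each batch prints one count per distinct field_1 value rather than a single total. The following is only a plain-Java sketch of that difference for one batch. It uses no Storm classes, and the class name BatchCountSketch, its methods, and the sample keys are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only: no Storm classes are used here.
class BatchCountSketch {

    // Mirrors grouping by field_1 and counting within one batch:
    // the result holds one count per distinct key.
    static Map<String, Long> countPerKey(List<String> batch) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (String key : batch) {
            counts.merge(key, 1L, Long::sum);
        }
        return counts;
    }

    // Total number of tuples in the batch: the sum of the per-key counts.
    static long totalCount(Map<String, Long> perKey) {
        long total = 0;
        for (long c : perKey.values()) {
            total += c;
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical field_1 values of the five tuples in one batch
        List<String> batch = Arrays.asList("a", "b", "a", "c", "a");
        Map<String, Long> perKey = countPerKey(batch);
        System.out.println(perKey);             // prints {a=3, b=1, c=1}
        System.out.println(totalCount(perKey)); // prints 5
    }
}
```

If a single total per batch is what you want, dropping the partitionBy/groupBy steps and calling aggregate(new Count(), new Fields("count")) directly on the stream should give one count tuple per batch; the each() call would then select only new Fields("count").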
