Posted to issues@storm.apache.org by "Alexander Kharitonov (JIRA)" <ji...@apache.org> on 2016/12/02 13:01:58 UTC
[jira] [Updated] (STORM-2231) NULL in DisruptorQueue while multi-threaded ack

     [ https://issues.apache.org/jira/browse/STORM-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexander Kharitonov updated STORM-2231:
----------------------------------------
    Description: 
I use a simple topology with one spout (9 workers) and one bolt (9 workers).
I have topology.backpressure.enable: false in storm.yaml.
The spouts emit about 10 000 000 tuples in 10 minutes. Pending for the spout is 80 000.
The bolts buffer their tuples for 60 seconds, then flush them to a database and ack the tuples in parallel (10 threads).
I read that OutputCollector can be used safely from multiple threads, so I do that.
I don't have any bottleneck in the bolts (flushing to the database) or the spouts (Kafka spout), but about 2% of tuples fail due to the tuple processing timeout (the failures are recorded in the spout stats only).
I am sure that the bolts ack all tuples, but some of the acks don't reach the spouts.
During multi-threaded acking I see many errors like this in the worker logs:
2016-12-01 13:21:10.741 o.a.s.u.DisruptorQueue [ERROR] NULL found in disruptor-executor[3 3]-send-queue:853877

I tried wrapping OutputCollector in a synchronized wrapper to fix the error, but it didn't help.

I found a workaround that helps: I do all the processing in the bolt in multiple threads, but call the OutputCollector.ack methods from a single separate thread.
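
A minimal sketch of this workaround, under stated assumptions: TupleAcker is a hypothetical stand-in for the ack side of OutputCollector, and SingleThreadAckQueue and AckDemo are illustrative names that are not part of Storm. Worker threads only enqueue ack requests; one dedicated thread drains the queue and is the only caller of ack.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the ack side of Storm's OutputCollector. */
interface TupleAcker<T> {
    void ack(T tuple);
}

/** Funnels ack requests from many worker threads into one acking thread. */
final class SingleThreadAckQueue<T> implements AutoCloseable {
    private final BlockingQueue<T> pending = new LinkedBlockingQueue<>();
    private volatile boolean closed = false;
    private final Thread ackThread;

    SingleThreadAckQueue(TupleAcker<T> collector) {
        ackThread = new Thread(() -> {
            try {
                // Keep draining until close() was called AND the queue is empty.
                while (!closed || !pending.isEmpty()) {
                    T tuple = pending.poll(50, TimeUnit.MILLISECONDS);
                    if (tuple != null) {
                        collector.ack(tuple); // only this thread calls ack
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "single-acker");
        ackThread.start();
    }

    /** Safe to call from any worker thread once a tuple has been flushed. */
    void requestAck(T tuple) {
        pending.add(tuple);
    }

    /** Stops accepting work and waits for the remaining acks to be sent. */
    @Override
    public void close() {
        closed = true;
        try {
            ackThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

/** Illustrative driver: several workers request acks, one thread performs them. */
final class AckDemo {
    static int run(int workers, int tuplesPerWorker) {
        // A plain ArrayList is enough here because only the acking thread writes to it.
        List<Integer> acked = new ArrayList<>();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try (SingleThreadAckQueue<Integer> acks =
                 new SingleThreadAckQueue<Integer>(acked::add)) {
            for (int w = 0; w < workers; w++) {
                final int base = w * tuplesPerWorker;
                pool.submit(() -> {
                    for (int i = 0; i < tuplesPerWorker; i++) {
                        acks.requestAck(base + i);
                    }
                });
            }
            pool.shutdown();
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return acked.size();
    }
}
```

In a real bolt the TupleAcker passed in would delegate to the collector's ack method. In this sketch, AckDemo.run(10, 1000) pushes 10 000 ack requests from 10 threads through the single acking thread.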

I think Storm has a bug in the multi-threaded use of OutputCollector.

If my topology has a much lighter load, like 500 000 tuples per 10 minutes, then I don't lose any acks.

  was:
I use a simple topology with one spout (9 workers) and one bolt (9 workers).
The spouts emit about 10 000 000 tuples in 10 minutes. Pending for the spout is 80 000.
The bolts buffer their tuples for 60 seconds, then flush them to a database and ack the tuples in parallel (10 threads).
I read that OutputCollector can be used safely from multiple threads, so I do that.
I don't have any bottleneck in the bolts (flushing to the database) or the spouts (Kafka spout), but about 2% of tuples fail due to the tuple processing timeout (the failures are recorded in the spout stats only).
I am sure that the bolts ack all tuples, but some of the acks don't reach the spouts.
During multi-threaded acking I see many errors like this in the worker logs:
2016-12-01 13:21:10.741 o.a.s.u.DisruptorQueue [ERROR] NULL found in disruptor-executor[3 3]-send-queue:853877

I tried wrapping OutputCollector in a synchronized wrapper to fix the error, but it didn't help.

I found a workaround that helps: I do all the processing in the bolt in multiple threads, but call the OutputCollector.ack methods from a single separate thread.

I think Storm has a bug in the multi-threaded use of OutputCollector.

If my topology has a much lighter load, like 500 000 tuples per 10 minutes, then I don't lose any acks.


> NULL in DisruptorQueue while multi-threaded ack
> -----------------------------------------------
>
>                 Key: STORM-2231
>                 URL: https://issues.apache.org/jira/browse/STORM-2231
>             Project: Apache Storm
>          Issue Type: Bug
>          Components: storm-core
>    Affects Versions: 1.0.1
>            Reporter: Alexander Kharitonov
>            Priority: Critical


