Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2019/12/10 23:14:17 UTC
[GitHub] [incubator-druid] dene14 opened a new issue #9011:
OrderedPartitionableRecord buffer full, storing iterator and retrying
URL: https://github.com/apache/incubator-druid/issues/9011
Today I ran into a very strange issue (not for the first time, actually) where all MiddleManagers stopped consuming data from the Kinesis stream for a long period (>40 minutes).
The only messages printed to the task logs were these:
```
2019-12-10T21:36:22,218 INFO [appenderator_persist_0] org.apache.druid.segment.realtime.appenderator.StreamAppenderatorDriver - Successfully handed off [1] segments.
2019-12-10T21:36:27,100 WARN [KinesisRecordSupplier-Worker-18] org.apache.druid.indexing.kinesis.KinesisRecordSupplier - OrderedPartitionableRecord buffer full, storing iterator and retrying in [5,000ms]
.............
2019-12-10T21:40:21,249 WARN [KinesisRecordSupplier-Worker-13] org.apache.druid.indexing.kinesis.KinesisRecordSupplier - OrderedPartitionableRecord buffer full, storing iterator and retrying in [5,000ms]
2019-12-10T21:40:21,500 INFO [intermediary-data-manager-0] org.apache.druid.indexing.worker.IntermediaryDataManager - Discovered partitions for [0] new supervisor tasks under location[var/tmp/intermediary-segments]
2019-12-10T21:40:21,503 INFO [intermediary-data-manager-0] org.apache.druid.indexing.worker.IntermediaryDataManager - Found [0] expired supervisor tasks
.............
2019-12-10T21:50:39,061 WARN [HttpClient-Netty-Worker-3] org.apache.druid.java.util.http.client.pool.ResourcePool - Resource at key[http://prod-druid-overlord-0.prod-druid-overlord-headless.prod-druid.svc.cluster.local:8090] was returned multiple times?
2019-12-10T21:50:39,061 WARN [task-runner-0-priority-0] org.apache.druid.discovery.DruidLeaderClient - Request[http://prod-druid-overlord-0.prod-druid-overlord-headless.prod-druid.svc.cluster.local:8090/druid/indexer/v1/action] failed.
org.jboss.netty.handler.timeout.ReadTimeoutException
at org.jboss.netty.handler.timeout.ReadTimeoutHandler.<clinit>(ReadTimeoutHandler.java:84) ~[netty-3.10.6.Final.jar:?]
at org.apache.druid.java.util.http.client.NettyHttpClient.go(NettyHttpClient.java:172) ~[druid-core-0.16.0-incubating.jar:0.16.0-incubating]
at org.apache.druid.java.util.http.client.AbstractHttpClient.go(AbstractHttpClient.java:33) ~[druid-core-0.16.0-incubating.jar:0.16.0-incubating]
at org.apache.druid.discovery.DruidLeaderClient.go(DruidLeaderClient.java:173) ~[druid-server-0.16.0-incubating.jar:0.16.0-incubating]
at org.apache.druid.discovery.DruidLeaderClient.go(DruidLeaderClient.java:145) ~[druid-server-0.16.0-incubating.jar:0.16.0-incubating]
at org.apache.druid.query.lookup.LookupReferencesManager.fetchLookupsForTier(LookupReferencesManager.java:569) ~[druid-server-0.16.0-incubating.jar:0.16.0-incubating]
at org.apache.druid.query.lookup.LookupReferencesManager.tryGetLookupListFromCoordinator(LookupReferencesManager.java:422) ~[druid-server-0.16.0-incubating.jar:0.16.0-incubating]
at org.apache.druid.query.lookup.LookupReferencesManager.lambda$getLookupListFromCoordinator$4(LookupReferencesManager.java:400) ~[druid-server-0.16.0-incubating.jar:0.16.0-incubating]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:86) ~[druid-core-0.16.0-incubating.jar:0.16.0-incubating]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:114) ~[druid-core-0.16.0-incubating.jar:0.16.0-incubating]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:104) ~[druid-core-0.16.0-incubating.jar:0.16.0-incubating]
at org.apache.druid.query.lookup.LookupReferencesManager.getLookupListFromCoordinator(LookupReferencesManager.java:390) ~[druid-server-0.16.0-incubating.jar:0.16.0-incubating]
at org.apache.druid.query.lookup.LookupReferencesManager.getLookupsList(LookupReferencesManager.java:367) ~[druid-server-0.16.0-incubating.jar:0.16.0-incubating]
at org.apache.druid.query.lookup.LookupReferencesManager.loadAllLookupsAndInitStateRef(LookupReferencesManager.java:350) ~[druid-server-0.16.0-incubating.jar:0.16.0-incubating]
at org.apache.druid.query.lookup.LookupReferencesManager.start(LookupReferencesManager.java:156) ~[druid-server-0.16.0-incubating.jar:0.16.0-incubating]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_202]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_202]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_202]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_202]
at org.apache.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler.start(Lifecycle.java:442) ~[druid-core-0.16.0-incubating.jar:0.16.0-incubating]
at org.apache.druid.java.util.common.lifecycle.Lifecycle.start(Lifecycle.java:339) ~[druid-core-0.16.0-incubating.jar:0.16.0-incubating]
at org.apache.druid.guice.LifecycleModule$2.start(LifecycleModule.java:140) ~[druid-core-0.16.0-incubating.jar:0.16.0-incubating]
at org.apache.druid.cli.GuiceRunnable.initLifecycle(GuiceRunnable.java:115) ~[druid-services-0.16.0-incubating.jar:0.16.0-incubating]
at org.apache.druid.cli.CliPeon.run(CliPeon.java:273) ~[druid-services-0.16.0-incubating.jar:0.16.0-incubating]
at org.apache.druid.cli.Main.main(Main.java:113) ~[druid-services-0.16.0-incubating.jar:0.16.0-incubating]
.............
2019-12-10T22:18:28,614 WARN [KinesisRecordSupplier-Worker-19] org.apache.druid.indexing.kinesis.KinesisRecordSupplier - OrderedPartitionableRecord buffer full, storing iterator and retrying in [5,000ms]
2019-12-10T22:18:29,343 INFO [task-runner-0-priority-0] org.apache.druid.segment.realtime.appenderator.BaseAppenderatorDriver - New segment[midas_2019-12-10T21:00:00.000Z_2019-12-10T22:00:00.000Z_2019-12-10T21:00:04.939Z_54] for row[MapBasedInputRow{timestamp=2019-12-10T21:35:33.144Z, event={app_id=............
```
### Affected Version
0.16.0
### Description
- Cluster size:
  - 8 MiddleManagers x 2 peons
  - 1 Overlord
  - 1 Coordinator
- Steps to reproduce the problem:
  - Happens sporadically; no issues reported on the Kinesis side (no throttling or other errors)
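For context, the repeating warning indicates that the Kinesis indexing task's internal record buffer is filling faster than the ingestion thread drains it, so the fetch workers back off and retry. In the Kinesis supervisor's `tuningConfig`, `recordBufferSize` controls the buffer capacity and `recordBufferFullWait` controls the retry wait (which matches the `[5,000ms]` in the logs). A hedged sketch of a supervisor `tuningConfig` fragment that enlarges the buffer as a possible mitigation; the values are illustrative, not a tested recommendation:

```json
{
  "tuningConfig": {
    "type": "kinesis",
    "recordBufferSize": 20000,
    "recordBufferFullWait": 5000
  }
}
```

This only treats the symptom (a larger buffer before back-off); it would not explain why the tasks stopped draining the buffer in the first place.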